You'll never guess where we're from - "eh?"
Social networks create their own regional dialects
January 10, 2011
By Joel Shurkin
People "tweeting" mundane 140-character messages on Twitter are creating evolving regional dialects just like those of spoken language. The dialects are so pronounced that senders in the US can be located just by marking the words they use.
Regional dialects are one of the joys of spoken English. In the US, "y'all" marks someone as a southerner; "cab" a New Yorker. Depending on where you live, a long sandwich can be a "hero," a "sub" or a "hoagie." But now those peculiarities are also evolving in social media.
Jacob Eisenstein, a post-doctoral fellow at Carnegie Mellon University in Pittsburgh, Pennsylvania, and his colleagues used Garden Hose, a Twitter service that collects tweets, to analyse a week's worth of the messages posted by cellphone in the US last March. All the tweeters had turned on their phones' GPS, so the messages were geotagged.
The researchers found that if you are cool in the San Francisco area, you will probably write "koo" on Twitter, but in southern California, you write "coo." You are "hella" tired in northern California, "deadass" tired in New York, and in Los Angeles you use an acronym for an obscenity. Other words can also give away location, including references to sports teams and rock bands.
Thousands of tweets
To eliminate commercial messages and spam, the researchers looked only at people who had written at least 20 messages that month, had fewer than 1,000 followers and followed fewer than 1,000 others. That left 9,500 users and 380,000 messages using 4.7 million words, with a total vocabulary of 5,216 words. The researchers then went looking for regionalisms in the tweets.
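The screening rule described above can be pictured as a simple filter. The following is only a rough sketch under stated assumptions, not the researchers' actual code: the `User` record, its field names and the sample accounts are all hypothetical, and only the three thresholds come from the article.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    messages: int    # messages written during the sample period
    followers: int   # accounts following this user
    following: int   # accounts this user follows


# Thresholds as reported in the article.
MIN_MESSAGES = 20
MAX_FOLLOWERS = 1000
MAX_FOLLOWING = 1000


def is_ordinary_user(user: User) -> bool:
    """Keep users with at least 20 messages and fewer than 1,000
    followers and followees, to screen out likely commercial or spam accounts."""
    return (
        user.messages >= MIN_MESSAGES
        and user.followers < MAX_FOLLOWERS
        and user.following < MAX_FOLLOWING
    )


# Made-up accounts, purely to exercise the filter.
sample = [
    User("a", messages=25, followers=300, following=200),     # kept
    User("b", messages=5, followers=100, following=100),      # too few messages
    User("c", messages=500, followers=50000, following=10),   # too many followers
    User("d", messages=40, followers=999, following=1500),    # follows too many
]

kept = [u.name for u in sample if is_ordinary_user(u)]
print(kept)
```

Only account "a" passes all three tests here. The limits on followers and followees are what exclude high-volume broadcasters and follow-everyone spam accounts, while the message minimum ensures there is enough text per person to analyse.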
When they compared what the words told them with the locations tagged in the messages, the geographic touchstones "are correct to within 300 miles," Eisenstein said.
Giving yourself away
This, of course, raises privacy issues: if you are giving away your location by your messaging, you are providing useful commercial data someone can use or sell.
Susannah Fox, associate director of digital strategy for the Pew Internet & American Life Project, points out that only 4 per cent of internet users say they reveal their specific locations by using services like Foursquare or Gowalla, but this new research shows that people's general whereabouts may be found out by analysing what they talk about on Twitter – or even how they talk.
"The limitation of this study, of course, is that only 8 per cent of online adults say they use Twitter," says Fox.
The researchers reported their findings on Friday to a meeting of the Linguistic Society of America in Pittsburgh, Pennsylvania.