wordwatching


Eggcorns in tweets

On Saturday, I posted on the topic of eggcorns in a general way and followed up on that discussion yesterday with an analysis of eggcorns in pop music. Today, I look at eggcorns in social media, using a large sample of tweets culled from Twitter by the folks at Illocution Inc as my corpus. From the beginning of this exercise, I figured that an investigation of eggcorns in tweets would be more productive than an examination of their use in lyrics for three reasons: 1) because Twitter relies on writing as the means of expression, and many of the eggcorns on the list I have compiled are writing-based; 2) because the collection I am looking at is relatively large (compared to, for instance, my collection of lyrics), comprising about 1.2 million tweets; and 3) because Twitter. As the platform encourages users to create and post responses in a swift and timely manner, and because language in social media generally tends to bridge the gap between speech and writing in a way that traditional print publications do not, it stands to reason that Twitter would be fertile ground for the growth and, hence, the harvesting of eggcorns. Today's post will cover the results of running about half of my list of nearly 700 eggcorns -- those beginning with letters A through O -- on a sample comprising 10% of the English-language tweets posted to Twitter in 2013, and a later post will present results using the second half of the eggcorn list (P-Z) on the same collection of tweets.
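For readers who want to try something similar on their own data, here is a minimal sketch of how a list of eggcorns might be run against a collection of tweets. It is not the tool used for this post; the function name `count_phrases`, the whole-phrase, case-insensitive matching rule, and the toy tweets are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_phrases(tweets, phrases):
    """Count case-insensitive, whole-phrase occurrences of each phrase.

    Hypothetical helper: the matching rule (no word character directly
    before or after the phrase) is an assumption, not the post's method.
    """
    patterns = {
        phrase: re.compile(
            r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", re.IGNORECASE
        )
        for phrase in phrases
    }
    counts = Counter()
    for tweet in tweets:
        for phrase, pattern in patterns.items():
            counts[phrase] += len(pattern.findall(tweet))
    return counts


# Invented toy tweets, not drawn from the actual collection.
sample_tweets = [
    "I could of sworn I locked the door",
    "every since monday i have been tired",
    "On route to the airport, could of left earlier",
]
eggcorns = ["could of", "every since", "on route", "duck tape"]

counts = count_phrases(sample_tweets, eggcorns)

# Keep only the eggcorns with at least one positive hit.
hits = {phrase: n for phrase, n in counts.items() if n > 0}
print(hits)
```

Matching on whole phrases keeps a string like "could often" from being counted as could of, though a real run would still need a manual check of the hits, since a phrase such as bare can be used in its ordinary sense.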

So what did my expedition find? Of the 365 eggcorns tested, 36 (or about 10%) had at least one positive hit in the collection of tweets. Starting with the hapax legomena, or oncers, the following eggcorns were found:

Fig. 1: A sample of eggcorn oncers in the Twitter collection (Lamont Antieau, wordwatching.org)

Other low-frequency items included the following:

Fig. 2: A sample of other low-frequency eggcorns in the Twitter collection (Lamont Antieau, wordwatching.org)

Another item that is often considered an eggcorn and makes an infrequent appearance in the Twitter collection I'm using is duck tape (4 times, including "I'm not putting duck tape on my body"); however, as the etymology and relationship of duck tape and duct tape are unclear, it is difficult to ascertain which should rightfully be called the eggcorn of the other. (Duct tape appears in the collection 17 times.)

Bare for bear occurs 13 times in the set, including in "grin and bare it", "a right to bare arms", and "can't bare clothes that say 'Beast Mode' on them". Every since for ever since occurs 18 times in the collection, as in the tweet "Every since I got a new number".

The two eggcorns on the list that appear most frequently in the Twitter collection are on route for enroute and could of for could've, which appear 75 times and 82 times, respectively. Adding the 3 appearances of in route to those of on route gives 78 appearances of in route/on route, which overshadow the 8 appearances of enroute; however, could've/could have weigh in at 670 and thus retain their hold over the challenger could of and its 82 appearances. (Apologies for shifting from the harvesting metaphor I usually use with eggcorns to the boxing metaphor, but when I think of could of, I think of coulda (n=99), which inevitably leads me to think of the following scene.)

In the next post, we will see how should've and would've are doing against the contenders should of and would of, and take a look at the performance of the remaining eggcorns on the list in the Twitter collection.
