Suppose we want to find out how discourse in The New York Times has evolved over a period of time. These days it would be neat to analyze this, since whenever we visit the front page we get firehosed with news about the coronavirus. How has the pandemic shaped the headlines of one of the most popular newspapers in the United States?
To answer a question like this, we would need to collect article metadata from The New York Times. Here I describe how to do this in Python.
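Once the metadata is collected, one simple way to look at the question is to measure what share of each month's headlines mention a keyword. The snippet below is only a rough sketch under stated assumptions: the `(date, headline)` tuple format, the function name `monthly_keyword_share`, and the sample rows are all invented for illustration and are not real New York Times data.

```python
from collections import Counter


def monthly_keyword_share(articles, keyword):
    """Return {"YYYY-MM": fraction of that month's headlines containing keyword}.

    `articles` is assumed to be an iterable of (iso_date, headline) pairs,
    where iso_date looks like "2020-03-15". This layout is hypothetical.
    """
    totals = Counter()
    matches = Counter()
    needle = keyword.lower()
    for iso_date, headline in articles:
        month = iso_date[:7]  # keep only the "YYYY-MM" prefix
        totals[month] += 1
        if needle in headline.lower():
            matches[month] += 1
    return {month: matches[month] / totals[month] for month in sorted(totals)}


# Invented sample rows, for illustration only.
sample = [
    ("2020-01-05", "City Council Debates New Budget"),
    ("2020-03-01", "Coronavirus Closes Local Schools"),
    ("2020-03-02", "A Quiet Weekend at the Museum"),
]
print(monthly_keyword_share(sample, "coronavirus"))
```

A case-insensitive substring match is the simplest possible choice here; it would miss synonyms such as "Covid-19", so a real analysis would likely need a list of terms.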
When you Google how to download the followers of a Twitter account, all these companies pop up, offering to do it for you for a small fee. But Twitter gives you all the tools you need to do it yourself, for free. This short guide takes you through a Python script that helps you use those tools.
At the end of this guide, you get a script that downloads a CSV containing all the followers and/or friends for any public Twitter handle. This file will also contain the following details about each person:
When classes moved online in 2020, academic institutions across the country watched as the rate of cheating soared. It’s never fun dealing with plagiarism, but it is important to detect it, regardless of where we stand in the academic debate on how to best handle this type of cheating during a pandemic, as well as in general.
The best tool that helps you to do this for free is Stanford’s MOSS. This tutorial provides a quick route to setting it up, to help you hit the ground running.
A few days after the WHO formally declared the COVID-19 pandemic in March 2020, some friends and I began collecting memes about it in a public Facebook album.
Initially, I shared the memes just for laughs—like, look at this absurd, unprecedented situation we’ve ended up in, we’re being told to stay home!—but as lockdown dragged on and on and on, the memes became a source of comic relief and distraction, helped to reduce the pains of social distancing, and often said aloud what we were all thinking.
Over the months, the album unexpectedly grew into a time capsule of a year…
Yesterday Netflix released Deaf U, which is not your average reality TV show. Immediately, you get dropped into rich and vibrant Deaf culture, where hands fly with sign language, and a controversial social hierarchy exists based on one’s degree of cultural Deafness.
But like almost every other reality TV show, Deaf U hypes up what sells — sex, partying, and drama. There isn’t a minute in an actual classroom.
Some in the Deaf community hope that despite this angle, presenting these topics through a Deaf lens will help redefine mainstream society’s perceptions about deafness, by giving them something familiar, relatable…
In this post, we write Python code to generate fake news headlines, using a Markov chain model trained on a corpus of real headlines from The New York Times over the past year.
Some of these fake headlines:
I Used to Hold Hands?
‘We’re Going Down, Down, Down’
We All in This Picture? | Sept. 18, 2019
Zonked on Vicodin in the Presidential Race
Mike Pence Makes Clear There Is a New Constitution
Who Knew How to Clean Your Child’s DNA Information?
A.I. Is Learning That Liberals Eat Their Own Lawyers, Too
How New Yorkers Want Cheap Wine, and Lots…
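To give a feel for how headlines like these can come out of a Markov chain, here is a minimal sketch of a first-order, word-level model. It is not necessarily the code from the post: the function names `build_chain` and `generate_headline`, the `max_words` cutoff, and the three-line corpus are all assumptions made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(headlines):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    starts = []  # first words of headlines, used to begin a new one
    for headline in headlines:
        words = headline.split()
        if not words:
            continue
        starts.append(words[0])
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return dict(chain), starts


def generate_headline(chain, starts, max_words=12, rng=random):
    """Walk the chain from a random start word until it dead-ends or hits max_words."""
    word = rng.choice(starts)
    output = [word]
    while len(output) < max_words and word in chain:
        word = rng.choice(chain[word])  # repeated followers are more likely
        output.append(word)
    return " ".join(output)


# Invented stand-in corpus; a real run would use the collected headlines.
corpus = [
    "How to Clean Your Kitchen",
    "How New Yorkers Want Cheap Rent",
    "Who Knew How to Bake Bread",
]
chain, starts = build_chain(corpus)
print(generate_headline(chain, starts, rng=random.Random(0)))
```

Because each next word depends only on the current one, the output tends to be locally plausible but globally odd, which is where the humor comes from. Conditioning on the previous two words instead of one would make the headlines more coherent, at the cost of copying the corpus more closely.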
Since its inception in 1991, arXiv, the main database for scientific preprints, has received almost 1.3 million submissions. All of this data can be useful for analysis, so we may want to access the full texts in bulk. This post goes over how we can do this using Python 3 and the Mac OS X command line.
Although the data is sitting right there on the server, it is not recommended to crawl arXiv directly due to limited server capacity. …
Are you interested in using the popular Python library Matplotlib to analyze text messages or any other conversational medium that includes emojis? You may have noticed some difficulties in visualizing those emojis.
This post investigates why Matplotlib cannot plot emojis from the Apple Color Emoji font, and how we can overcome this lack of support to get the results we want.
An attempt to plot emojis:
Although we explicitly specified it as the font, these emojis are not Apple Color Emoji. …
Data scientist & computer science PhD student. I write about my fun projects, in addition to how-to guides that help you get data for your own fun projects!