Data Analysis Reveals the Spookiest City in the US

October 31, 2018 | Category: Blog

Trick or treat?

Yes, the time has come to dress up in Halloween costumes, go trick-or-treating, and sit by the fireplace swapping the horrifying ghost stories people have heard since childhood. That said, did you know that Halloween is the second-biggest commercial holiday in the US, with total spending of up to $9 billion? So, considering the love between the US and Halloween, it’d be interesting to dig deep and find out the spooky elements of the country. We’ll find out if your city falls in the list of haunted places and whether you should be a bit extra careful this Halloween.

For this study, we extracted data from a website called Shadow Lands (feeling spooked yet?) to build the data set. Not only does it list haunted locations in the US, it also describes the history behind each place. Visitors to the site can add their own haunted place if it is missing! From the website, we captured several data fields for each location. Here is the list:

  • location 
  • city
  • state
  • description text
  • latitude and longitude of the city


Here is what we’re unraveling from the analysis:

  • Top 30 haunted cities
  • States ranked by the number of haunted places
  • Heatmap of haunted places across the US
  • Frequently used words in the description
  • Underlying connections of the words

The most haunted city

The chart shows the top thirty cities according to the number of haunted places. We see that Los Angeles, San Antonio and Honolulu are at the top spots when it comes to haunted places. 
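A ranking like this comes down to counting records per city. Here is a minimal sketch using only the Python standard library; the sample records and the `top_cities` helper are hypothetical stand-ins, not the actual data set or the code used for this study.

```python
from collections import Counter

# Hypothetical sample records using the fields listed earlier
# (location, city, state); the real data set is not reproduced here.
records = [
    {"location": "Old Theater", "city": "Los Angeles", "state": "California"},
    {"location": "Hillside Campus", "city": "Los Angeles", "state": "California"},
    {"location": "Mission Road", "city": "San Antonio", "state": "Texas"},
]


def top_cities(rows, n=30):
    """Return the n (city, state) pairs with the most haunted locations."""
    # Counting on (city, state) keeps same-named cities in different states apart.
    counts = Counter((row["city"], row["state"]) for row in rows)
    return counts.most_common(n)


ranking = top_cities(records, n=30)
for (city, state), count in ranking:
    print(f"{city}, {state}: {count}")
```

Counting on `row["state"]` alone would give a state-level ranking in the same way.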

Interestingly, the descriptions of the spooky locations in Los Angeles refer to Hollywood twenty-five times and to Universal Studios twice. The following locations in LA also come up often:

  • Boyle Heights
  • Loyola Marymount University
  • Occidental College

Be careful in these areas!

The most haunted state

Of all the states, California tops the list (not that it is a surprise), closely followed by Texas and Pennsylvania. If you’d rather stay somewhere less spooky, with a lower number of “incidents”, I would recommend Montana, Delaware and Alaska, since they are the least haunted states.

Number of haunted places by state in US

Heatmap of the haunted places

The charts for haunted cities and states give a fair idea, but is there another way to visualize how the haunted places are spread across the US? That’s when a heatmap comes into play to give an idea of the density of the locations.

Clearly, the East Coast has denser clusters of haunted places than the West Coast (only epicenters like LA, San Francisco, and Seattle contribute to spookiness here). Apart from that, we see that the Southern US is more haunted than the Northwestern US.
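The density behind a heatmap can be approximated by binning coordinates into a grid and counting the points in each cell. The sketch below assumes one (latitude, longitude) pair per haunted place; the `density_grid` helper, the one-degree cell size and the sample coordinates are illustrative assumptions, not the method used to draw the map above.

```python
import math
from collections import Counter


def density_grid(points, cell_deg=1.0):
    """Count points per grid cell, where each cell spans cell_deg degrees."""
    counts = Counter()
    for lat, lon in points:
        # Floor division maps every coordinate to the south-west corner
        # index of the cell that contains it.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Hypothetical coordinates: two points near Los Angeles, one near San Antonio.
points = [(34.05, -118.24), (34.10, -118.30), (29.42, -98.49)]
grid = density_grid(points)
print(grid.most_common(1))
```

Cells with higher counts would be drawn in hotter colors; a smaller `cell_deg` gives a finer but noisier map.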

Frequently used words

Now, we will look at the most frequently used words in the description text of the data set. The following word cloud shows the top 300 terms:


It shows that words such as ‘night’, ‘people’, ‘old’, ‘see’, ‘house’, ‘ghost’, ‘room’ and ‘building’ are prevalent. Some of the interesting findings are the following:

  • It seems the chance of encountering mysterious beings is higher in houses, in buildings and on roads than in cemeteries.
  • The cumulative word count for female terms (women/girls/lady) is higher than for male terms (men/boys). The word count for ‘old’ is higher than for ‘young’.
  • Vampires outnumber werewolves in terms of word count.
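Word counts like these can be produced by tokenizing each description and tallying the words. This is a minimal sketch; the stopword list is deliberately tiny, and the sample descriptions and the `word_counts` helper are made up for illustration.

```python
import re
from collections import Counter

# A very small, illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "at", "by", "in", "is", "it", "of", "on", "the", "to", "was"}


def word_counts(descriptions):
    """Count lowercase words across all descriptions, skipping stopwords."""
    counts = Counter()
    for text in descriptions:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


# Hypothetical description texts.
descriptions = [
    "A ghost is seen at night in the old house.",
    "People hear the ghost at night.",
]
counts = word_counts(descriptions)
print(counts.most_common(3))
```

Comparisons such as vampires versus werewolves then reduce to looking up the relevant keys in `counts`.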

Relationship between words

Although we figured out the frequently used words, it’d be much more insightful to find out how the words in the description text relate to each other. Here we will focus on bi-grams (pairs of consecutive words) and visualize the relationships via a network graph.

Network graph of bi-grams

This network graph shows some interesting connections. For instance, there is a cluster of words related to soldiers and civil war, which suggests that some of the haunted places emerged from the death and destruction caused by the Civil War. The larger cluster at the bottom associates ghost with haunt, hunters and stories (which makes sense). We also see that words such as shadowy, ghostly and dark are associated with figures, which in turn is connected to walking. Check out how the word poltergeist (noisy ghost) is associated with paranormal activity! This is mostly because of the nature of poltergeists: they are known to levitate objects and horrify people by pinching, hitting and tripping them.
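The edges of a graph like this come from counting how often two words appear next to each other. Below is a minimal sketch of that counting step; the `bigram_counts` and `edges` helpers, the `min_count` threshold and the sample sentences are assumptions made for illustration, and stopword removal and the drawing itself are left out.

```python
import re
from collections import Counter


def bigram_counts(descriptions):
    """Count consecutive word pairs within each description."""
    counts = Counter()
    for text in descriptions:
        words = re.findall(r"[a-z]+", text.lower())
        # zip pairs each word with its successor: (w0, w1), (w1, w2), ...
        counts.update(zip(words, words[1:]))
    return counts


def edges(counts, min_count=2):
    """Keep pairs seen at least min_count times as weighted graph edges."""
    return [(a, b, n) for (a, b), n in counts.items() if n >= min_count]


# Hypothetical description texts.
descriptions = [
    "Civil war soldiers walk here.",
    "A civil war hospital.",
]
pairs = bigram_counts(descriptions)
print(edges(pairs))
```

Each resulting triple is a pair of connected words plus a weight, which a graph library could render as nodes and edges.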

So, that was a fun use of analytics and data sourcing via web scraping. Now it’s time for you to carve a pumpkin and impress people with these talking points at your Halloween party.

Have a spooktacular Halloween and may the holy ghost bless you!

© Promptcloud 2009-2020 / All rights reserved.