Welcome to “Learn with Eskwelabs!” This series is called “From the Notebook of Our Fellows” because you will be guided by our very own alumni through a mix of basic and advanced data science concepts. Every time you read from one of our Fellows’ notebooks, just imagine that you have a data BFF or lifelong learning friend who’ll hold your hand at every step.
Hi, everyone! My name is Basty, and I’m a data scientist and educator from Metro Manila. If there’s one thing I value deeply, it’s education. Education has allowed me not only to dive into the amazing world of code and data, but also to encourage and inspire others to do the same. Read more about me here.
Outside of work and school, I love playing video games like Valorant and League of Legends. I also love listening to Broadway musicals (HAMILTON, DEH, TICK TICK BOOM ALL THE WAY!). Lastly, I LOVE watching Friends, New Girl, HIMYM, and The Big Bang Theory.
Now, let’s take a look at my notebook!
Hey, there! We’ve already been learning about different machine learning algorithms and data scraping, and it’s time to add a new skill to your data science arsenal. In this blog, I’ll help you learn a technique that is widely used in the real world: time series!
Before I go any further, a quick note: I’ll be using the words forecast and predict interchangeably, so hopefully you don’t get confused. I’ll also be focusing mainly on the analysis aspect of time series rather than the forecasting aspect.
Time series analysis and forecasting is a very prominent field in data science. Technically, it is the process of extracting information from time-series data to gain insights and make forecasts. In other words, it helps us analyze how something changes over time and predict how likely something is to happen in the future.
“Wait, so time-series data is different from the data we’ve been talking about in the previous blogs?” Well, yes! As the name suggests, time-series data is a sequence or series of data points that involves a time component. And since we’re dealing with time, this kind of data is known to be non-static (it captures change or motion) and continuous.
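To make “data points with a time component” concrete, here is a minimal sketch in pandas. The values and the name `ts` are made up for illustration; the only point is that every observation is tied to a timestamp and the timestamps are in order.

```python
import pandas as pd

# Made-up monthly values, purely for illustration.
dates = pd.date_range(start="2020-01-01", periods=6, freq="MS")  # month starts
values = [10.0, 12.5, 11.0, 13.2, 14.8, 15.1]

# A time series is a set of values indexed by time.
ts = pd.Series(values, index=dates, name="value")

print(ts)
print(ts.index.is_monotonic_increasing)  # True when the timestamps are in order
```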
Earlier, I mentioned the words analysis and forecasting, but what is the difference between them when it comes to time series? The two are often used interchangeably, but there is a thin line between them, and which one you need depends on your time series problem.
Basically, time series analysis is the study of patterns and trends to gain useful insights from time series data, while time series forecasting involves predicting future trends based on historical data. Hopefully, you now have a clear understanding of the two.
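Here is a small sketch of that difference, using made-up numbers. The rolling mean describes what already happened (analysis), while the last line guesses the next value (forecasting). The “average of the last three months” rule is only a naive stand-in that I picked for illustration, not a recommended forecasting method.

```python
import pandas as pd

# Made-up monthly values, purely for illustration.
ts = pd.Series(
    [100, 110, 105, 120, 130, 125],
    index=pd.date_range("2021-01-01", periods=6, freq="MS"),
    dtype="float64",
)

# Analysis: smooth the history to see the pattern in past data.
rolling_mean = ts.rolling(window=3).mean()

# Forecasting: a naive guess for the next month,
# taken as the average of the last three observations.
next_period = ts.index[-1] + pd.DateOffset(months=1)
forecast = ts.iloc[-3:].mean()

print(rolling_mean)
print(next_period.date(), forecast)
```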
As I mentioned earlier, time series analysis is a prominent field in data science, and it is widely used in the real world. It is used in healthcare analytics, geospatial analysis, and weather forecasting! Here are some industries where time series analysis is used:
You might also be wondering about the difference between regression and time series, because they seem to work the same way.
They both have continuous target variables, and they both predict future outcomes, so what’s the difference?
Regression analysis is generally well suited to simple relationships, such as predicting the age of a person based on their height or the GPA of a student based on the amount of time they study. However, if we want to look at how something behaves over time so that we can identify patterns and trends, then that is where we use time series analysis.
Any time series problem can be broken down into several components, which can be very useful for analysis and forecasting. These various components can help us highlight the trend and behavior of the data over time. But hold on! Before we look into the different components, it is worth knowing there are 2 integrants that you should be aware of:
The components that we use to break down time series data are:
We’ve been focusing a lot on the theoretical side of time series analysis, and so it’s time to actually do a simple analysis ourselves!
Before proceeding, make sure to download the dataset from here: time series dataset
Read the dataset
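A minimal sketch of this step, assuming the dataset is a CSV file. To keep the example runnable on its own, it reads three made-up rows from an in-memory string instead of the real file. The file name in the comment is hypothetical, and only the `Period` and `Revenue` column names come from this walkthrough.

```python
import io

import pandas as pd

# Stand-in for the downloaded file: three made-up rows, not the real data.
# With the real file you would pass its path instead, for example
# pd.read_csv("time_series_dataset.csv")  # hypothetical file name
sample_csv = io.StringIO(
    "Period,Revenue\n"
    "2015-01-01,100.0\n"
    "2015-02-01,120.0\n"
    "2015-03-01,110.0\n"
)

df = pd.read_csv(sample_csv)

print(df.head())   # first few rows
print(df.shape)    # (rows, columns)
```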
Cleaning the dataset
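What cleaning means depends on the file, so treat this as a sketch under assumptions: it drops rows that are completely empty, converts `Period` to a proper datetime type, and sorts by time. The rows below are made up, and the real dataset may need different steps.

```python
import pandas as pd

# Made-up rows with one completely empty row, purely for illustration.
df = pd.DataFrame(
    {
        "Period": ["2015-02-01", "2015-01-01", "2015-03-01", None],
        "Revenue": [120.0, 100.0, 110.0, None],
    }
)

df = df.dropna(how="all").copy()              # remove rows where every value is missing
df["Period"] = pd.to_datetime(df["Period"])   # text -> datetime
df = df.sort_values("Period").reset_index(drop=True)  # oldest first

print(df.dtypes)
print(df.shape)
```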
Our dataset has 5 columns and 96 rows. The columns are:
Plotting the line chart for all columns
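A minimal sketch of this plot with pandas and matplotlib. The numbers are made up, and `Metric_A` and `Metric_B` are placeholder names standing in for the other columns of the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; in a notebook, drop this line
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data; Metric_A and Metric_B are placeholder column names.
df = pd.DataFrame(
    {
        "Period": pd.date_range("2015-01-01", periods=4, freq="MS"),
        "Revenue": [100.0, 120.0, 110.0, 130.0],
        "Metric_A": [60.0, 70.0, 65.0, 72.0],
        "Metric_B": [40.0, 50.0, 45.0, 58.0],
    }
)

# With Period as the index, every remaining column becomes one line.
ax = df.set_index("Period").plot(figsize=(10, 5), title="All columns over time")
ax.set_xlabel("Period")
ax.set_ylabel("Value")
plt.tight_layout()
```

In a notebook, you would finish with `plt.show()` to display the chart.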
This plot contains the data from all 5 columns at once, so it’s hard to get a clear view of any single one. Let’s focus on the time series of revenue from 2015 to 2020 by dropping all the other columns.
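One way to sketch this step is to select the two columns we want to keep, which has the same effect as dropping everything else. The data and the `Metric_A` and `Metric_B` column names are placeholders.

```python
import pandas as pd

# Made-up data; Metric_A and Metric_B are placeholder column names.
df = pd.DataFrame(
    {
        "Period": pd.date_range("2015-01-01", periods=4, freq="MS"),
        "Revenue": [100.0, 120.0, 110.0, 130.0],
        "Metric_A": [60.0, 70.0, 65.0, 72.0],
        "Metric_B": [40.0, 50.0, 45.0, 58.0],
    }
)

# Keep only Period and Revenue.
revenue_df = df[["Period", "Revenue"]].copy()

# Equivalent alternative: name the columns to remove instead.
# revenue_df = df.drop(columns=["Metric_A", "Metric_B"])

print(revenue_df.head())
```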
Now we only have the Period and Revenue columns. Let us now plot the graph!
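A minimal sketch of the revenue plot, again with made-up numbers in place of the real dataset.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; in a notebook, drop this line
import matplotlib.pyplot as plt
import pandas as pd

# Made-up revenue values, purely for illustration.
revenue_df = pd.DataFrame(
    {
        "Period": pd.date_range("2015-01-01", periods=6, freq="MS"),
        "Revenue": [100.0, 120.0, 110.0, 130.0, 125.0, 140.0],
    }
)

# One line: Period on the x-axis, Revenue on the y-axis.
ax = revenue_df.plot(
    x="Period", y="Revenue", figsize=(10, 5), legend=False, title="Revenue over time"
)
ax.set_ylabel("Revenue")
plt.tight_layout()
```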
In this time series graph, we can see that there is an increasing trend for the company’s revenue from 2015 to 2020.
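If you want to back up what your eyes see in the chart, two quick checks are the average per year and the slope of a straight line fitted through the points. This is only a sketch on made-up numbers, and both checks are my own suggestion rather than part of the original walkthrough.

```python
import numpy as np
import pandas as pd

# Made-up series: a steady climb with a small up-and-down wiggle.
values = [100 + 2 * i + (5 if i % 2 else -5) for i in range(24)]
revenue = pd.Series(
    values,
    index=pd.date_range("2015-01-01", periods=24, freq="MS"),
    dtype="float64",
)

# Check 1: average revenue per calendar year.
yearly_mean = revenue.groupby(revenue.index.year).mean()

# Check 2: slope of a straight line fitted through the points.
# A positive slope means the series rises on average.
slope = np.polyfit(np.arange(len(revenue)), revenue.to_numpy(), 1)[0]

print(yearly_mean)
print(slope)
```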
Congratulations on reaching the end of this blog! We learned some pretty interesting concepts about time series, my friend (you are now a time lord—just kidding!).
To summarize everything we’ve talked about in this blog:
I hope you had fun exploring time series, and that you continue to learn more about it! If you’re interested in learning more about time series analysis and forecasting and you want to apply your newfound knowledge, join me in my next blog, where we’ll be conducting time series analysis on Manila’s rising sea level data!
If you’re curious to learn more about this topic, did you know that in the 12-week Data Science Fellowship, you’ll be applying time series analysis to the music industry? If you’re as into music as I am, I encourage you to join the bootcamp, where you’ll have the chance to analyze your favorite artist’s Spotify streams! Whoever your favorite artist is, or whatever genre you listen to, there is an application of time series that you can use in the bootcamp!
I hope you found this blog useful, and that you stick around for the next one where we’ll be applying it to a real project!
Never stop learning!
Updated for Data Science Fellowship Cohort 10 | Classes for Cohort 10 start on September 12, 2022.