Hi, everyone! My name is Basty, and I’m a data scientist and educator from Metro Manila. If there is one thing I value deeply, it is education. Education has allowed me not only to dive into the amazing world of code and data, but also to encourage and inspire others to do the same. Read more about me here.
Outside of work and school, I love playing video games like Valorant and League of Legends. I also love listening to Broadway musicals (HAMILTON, DEH, TICK TICK BOOM ALL THE WAY!). Lastly, I LOVE watching Friends, New Girl, HIMYM, and The Big Bang Theory.
Now, let’s take a look at my notebook!
Did you catch Demo Fest last August 10? Don’t worry if you missed it, because I’m here to walk you through the two capstone projects that were presented on behalf of the Data Science Fellowship.
In this blog post, we’ll explore simulations of how power outages affect the lives of our fellow kababayans in Visayas through Network Analytics. Aside from that, we’ll also talk about how data science and machine learning can be a tool to empower the gaming industry by predicting the success of a game using player retention. Let’s get started!
Visayas Transmission Simulator, also known as VTS, is one of the two capstone projects that were showcased during Demo Fest. It was made by fellows Kurt, Gelo, Pau, and Jed, together with their mentor JC. The project focuses mainly on simulating the effect of line outages in the power grid of Visayas through Network Analytics.
Power crises in Visayas have been a recurring issue this year. They have also delayed public information, with approximately 1.5 hours of delay on social media and 7 hours of delay on news outlets.
This study aims to improve outage understanding and enhance warning systems by utilizing live alerts data and examining the network structure of the Visayas grid.
The team has identified four specific scopes and limitations for the project:
Here is what we know about the Visayas Grid:
75% of all advisories in Visayas refer to line outages, while line trippings and restoration in Visayas exhibit a random occurrence pattern, suggesting a lack of predictable factors or consistent triggers.
Here is an image of the Visayas Grid Single Line Diagram (SLD), which was extracted from the Market Network Model (MNM) as of April 2023. This diagram is composed of 5 minor grids: Cebu, Leyte-Samar, Bohol, Negros, and Panay. It will be used as the basis of the network and for identifying affected nodes.
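To make the idea of "identifying affected nodes" a bit more concrete, here is a minimal sketch of a connectivity-only outage check. It is not the team's actual simulator: the node names, the lines, and the choice of `PlantA` as the only source are all made up for illustration, and a real grid study would also consider load flow and capacity, which this sketch ignores.

```python
from collections import deque

# Hypothetical toy grid: these nodes and lines are illustrative, not the real SLD.
LINES = [
    ("PlantA", "Sub1"),
    ("Sub1", "Sub2"),
    ("Sub2", "Sub3"),
    ("Sub1", "Sub3"),
    ("Sub3", "Sub4"),
]
SOURCES = {"PlantA"}  # assumed generation nodes


def build_adjacency(lines):
    """Build an undirected adjacency map from a list of (node, node) lines."""
    adjacency = {}
    for a, b in lines:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def energized_nodes(lines, sources):
    """Return every node reachable from at least one source (breadth-first search)."""
    adjacency = build_adjacency(lines)
    seen = set(sources)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        # A source left with no lines simply has no neighbors to visit.
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def affected_by_outage(lines, sources, outaged_line):
    """Return nodes that lose their path to every source when one line trips."""
    remaining = [line for line in lines if set(line) != set(outaged_line)]
    before = energized_nodes(lines, sources)
    after = energized_nodes(remaining, sources)
    return before - after


if __name__ == "__main__":
    for line in LINES:
        print(line, "->", sorted(affected_by_outage(LINES, SOURCES, line)))
```

In this toy grid, tripping the `Sub1`-`Sub2` line affects nobody because `Sub2` can still be reached through `Sub3`, while tripping `Sub3`-`Sub4` isolates `Sub4`. That difference between redundant and single-path lines is the kind of thing a network view of the grid makes visible.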
After the team conducted EDA, these were the results of the Network Analysis using different centrality measures.
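If centrality measures are new to you, here is a small sketch of two common ones, degree centrality and betweenness centrality, written in plain Python. The post does not say which measures or tools the team used, so treat this as a generic illustration on a made-up five-node graph (`H`, `A`, `B`, `C`, `D` are hypothetical node names), not a reproduction of their analysis.

```python
from collections import deque

# Hypothetical toy network: a hub H with three neighbors, plus a tail C-D.
GRAPH = {
    "H": {"A", "B", "C"},
    "A": {"H"},
    "B": {"H"},
    "C": {"H", "D"},
    "D": {"C"},
}


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        order = []  # nodes in the order BFS reaches them
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint, so halve the totals.
    return {node: value / 2 for node, value in score.items()}


if __name__ == "__main__":
    print(degree_centrality(GRAPH))
    print(betweenness_centrality(GRAPH))
```

Degree centrality rewards nodes with many direct connections, while betweenness rewards nodes that sit on many shortest paths between other nodes. In a power grid setting, a node with high betweenness is a natural candidate for "if this goes down, a lot gets cut off."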
In terms of further improving the project, the team recommends:
For the development of the power grid:
“Leveling Up with Data Science: Player Retention in the Gaming Realm” was the second project showcased during Demo Fest, and the team consists of Jamie, Harold, Ace, Gian, and Patrick. The project focuses on using a machine learning model to predict the success of a game through player retention.
The Philippine gaming industry earned 78 Trillion PHP back in 2020 and is expected to grow to 121 Trillion PHP by 2027. However, while the PH Gaming Industry is highly valued, local game developers struggle to succeed as stakeholders within the ecosystem.
According to the flowchart, a player can either finish a game or drop it before finishing. And so, the team decided to focus on player retention to assess the likelihood that a game would succeed.
Since the nature of the project is classification, the team first looked at the correlation of the features to ensure that there is no multicollinearity and that they were using features that had a relationship with the target variable.
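Here is a rough sketch of what such a correlation check can look like, using only the Python standard library. The numbers are invented, and `playtime_minutes` is a hypothetical duplicate feature added just to trigger the check. The 0.9 threshold is also an assumption, not a value from the team's notebook.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(variance_x * variance_y)


def collinear_pairs(features, threshold=0.9):
    """List feature pairs whose absolute correlation meets or exceeds the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Illustrative values only; not taken from the team's dataset.
FEATURES = {
    "playtime": [1, 2, 3, 4, 5],
    "playtime_minutes": [60, 120, 180, 240, 300],  # hypothetical redundant column
    "dropped": [5, 3, 4, 1, 2],
}

if __name__ == "__main__":
    print(collinear_pairs(FEATURES))
```

A flagged pair is a hint that one of the two features can probably be dropped. Pairwise correlation is only a first pass, though, since multicollinearity can also involve three or more features at once.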
The dataset was scraped from both the RAWG API and the Steam API. It initially contained 30,251 games and was later cut down to 13,280 after pre-processing.
The team ultimately decided to use a Gradient Boosting algorithm to model the data. They also used SHAP, a game-theoretic package for explaining the output of any machine learning model, and found that the features contributing most to the results were “to_play”, “no_of_reviews”, “dropped”, and “playtime”. Higher values of these features, except for “dropped”, all pushed the model toward a positive prediction.
In summary, organizations such as The Game Developers Association of the Philippines (GDAP) should focus on four main features based on the model: to_play, review_count, dropped, and playtime.
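SHAP builds on Shapley values from cooperative game theory, so here is a tiny sketch of that underlying idea. It computes exact Shapley values for a made-up linear scoring function by averaging each feature's marginal contribution over every possible ordering. The function `toy_score`, its weights, and the input values are all assumptions for illustration. This is not the team's Gradient Boosting model, and it does not use the SHAP package itself.

```python
from itertools import permutations


def toy_score(features):
    """Hypothetical stand-in for a trained model; the weights are made up."""
    return (
        0.5 * features["to_play"]
        + 0.25 * features["no_of_reviews"]
        - 1.0 * features["dropped"]
        + 0.125 * features["playtime"]
    )


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)  # start from the baseline input
        previous = predict(current)
        for name in ordering:
            current[name] = instance[name]  # reveal one feature at a time
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# Illustrative inputs only.
GAME = {"to_play": 4, "no_of_reviews": 8, "dropped": 2, "playtime": 16}
BASELINE = {"to_play": 0, "no_of_reviews": 0, "dropped": 0, "playtime": 0}

if __name__ == "__main__":
    print(shapley_values(toy_score, GAME, BASELINE))
```

With these toy weights, "dropped" gets a negative contribution while the other three features get positive ones, which mirrors the direction of the effects described above. The contributions also add up to the difference between the game's score and the baseline score. Trying every ordering only works for a handful of features, which is why the SHAP package relies on faster algorithms for real models.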
Another helpful tactic is to establish an online presence by creating social media pages or press releases for their games. Lastly, developers should also consistently engage with the community to figure out what gamers actually want.
For further improving the analysis:
1. Conduct a time series analysis of player data such as:
2. Use NLP for social media impact and coverage
With the two projects showcased during Demo Fest, the fellows demonstrated their remarkable potential in this field by building projects in two vastly different domains. Through the VTS project, the team raised social awareness of power outages in Visayas through network analytics.
The Player Retention in the Gaming Realm project showcased that while the gaming industry still has a long journey towards data maturity, there are already numerous ways we can utilize data in order to improve the gaming experience.
As data scientists, these projects have truly sparked our curiosity, pushing us to explore new possibilities and leverage the power of data for a better world. The fellows’ journey in Eskwelabs was about merging the brilliance of data science with real human needs, aiming to create a tangible impact in people's lives and communities. To all our remarkable Fellows, congratulations and see you in the industry!