Experience Eskwelabs: What are the Capstone Projects of the Cohort 9 Fellows?

Breaking down what each of Cohort 9's capstone projects means.

Hi, everyone! My name is Basty, and I’m a data scientist and educator from Metro Manila. If there’s one thing I value deeply, it’s education. Education has allowed me not only to dive into the amazing world of code and data, but also to encourage and inspire others to do the same. Read more about me here.

Welcome back to our two-part Demo Day series. In the first part, we talked about what Demo Day is and why it is the culminating event of the Data Science Fellowship at Eskwelabs. In this second part, as promised, we’ll go through each capstone project prepared by the Fellows (students).

What are Capstone Projects?

A capstone project is the final project that the Fellows produce in the 15-week Data Science Fellowship. It is the culmination of all that they've learned and worked hard for during the Fellowship. Its counterpart in the Data Analytics Bootcamp is the Company Business Review.

Now let's try to understand how they used data science to solve their chosen problems. Without further ado, let’s gooo!

Group 1 - Cabanatuan

Group 1 is composed of Fellows Jacob, Jota, Jopet, Ron, and Gelo, and their project is titled Road to Zero Poverty: A machine learning approach to alleviating poverty in Cabanatuan City.

In this capstone project, they’ve decided to take on the problem of poverty in our country, specifically in Cabanatuan City. Poverty in Cabanatuan has resulted in the neglect and marginalization of its people. In fact, Nueva Ecija, the province that Cabanatuan City is in, was not recognized by Spain as a separate country only because of poverty.

Across the Philippines, 26.14 million people live below the poverty line, which is roughly Php 12,000 a month for a family of 5. Aside from the problem of income, the team also highlighted that poverty is such a big problem in Cabanatuan City because it is multi-dimensional: it also covers access to water and sanitation, education, and more. With all of these in mind, the group came up with a question, "What data-driven solutions can we provide to Cabanatuan City in alleviating poverty?"

In order for them to answer their question, they’ve identified 3 objectives, namely:

  • To uncover the factors affecting poverty at the household level
  • To quantify the factors with the most impact on poverty at the barangay level
  • To strategically allocate the solution throughout Cabanatuan City

To achieve these objectives, they used the Community-Based Monitoring System (CBMS) 2018 Cabanatuan dataset, which featured poverty indicators on health, nutrition, housing, water, education, income, employment, and peace & order.

For the first objective, they built a classification model with income as the main factor to classify whether a household is poor or non-poor. Through this model, they were able to uncover the different factors affecting poverty in a household.
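To make the first objective more concrete, here is a minimal sketch of a poor/non-poor classifier. The post does not say which algorithm the group used, so this assumes logistic regression written from scratch, and every household row, label, and feature name below is invented for illustration rather than taken from the CBMS data.

```python
import math

# Hypothetical households: (monthly income in thousand PHP, has safe water: 1/0).
# These rows and labels are invented for illustration, not taken from CBMS.
rows = [(6.0, 0), (8.0, 0), (9.5, 1), (11.0, 0),
        (14.0, 1), (18.0, 1), (25.0, 1), (30.0, 0)]
is_poor = [1, 1, 1, 1, 0, 0, 0, 0]

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Estimated probability that a household with features x is poor."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias with plain batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        errors = [predict_proba(x, w, b) - t for x, t in zip(X, y)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b

X = standardize(rows)
w, b = fit_logistic(X, is_poor)
print("weights (income, safe water):", [round(v, 2) for v in w])
print("P(poor) for the lowest-income household:", round(predict_proba(X[0], w, b), 2))
```

Because the features are standardized, the sign and size of each weight give a rough reading of how strongly that factor is tied to a household being classified as poor.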

For the second objective, at the barangay level, they used linear regression to quantify the factors with the most impact on poverty.
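As a rough illustration of how linear regression can rank factors, the sketch below fits a separate one-predictor least-squares line for each factor on standardized values and compares the slopes. This is an assumption for teaching purposes: the group may well have fit a single multiple regression instead, and the barangay figures and factor names here are made up.

```python
from statistics import mean, pstdev

# Hypothetical barangay-level data (percent of households); invented for illustration.
poverty_rate = [12.0, 18.0, 25.0, 31.0, 40.0]
factors = {
    "no_safe_water": [5.0, 9.0, 14.0, 18.0, 24.0],
    "unemployed": [8.0, 7.0, 10.0, 9.0, 12.0],
}

def zscores(values):
    """Standardize values so slopes are comparable across factors."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols_slope(x, y):
    """Closed-form least-squares slope b for the line y = a + b*x."""
    mx, my = mean(x), mean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

y = zscores(poverty_rate)
impact = {name: ols_slope(zscores(vals), y) for name, vals in factors.items()}

# Rank factors by the absolute size of their standardized slope.
ranked = sorted(impact, key=lambda name: abs(impact[name]), reverse=True)
for name in ranked:
    print(name, round(impact[name], 3))
```

With standardized inputs, each slope says how many standard deviations the poverty rate moves for a one standard deviation change in the factor, which is what makes the factors comparable.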

Lastly, for the third objective they used clustering to strategically allocate the solutions they came up with.
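The post does not name the clustering algorithm used here, so the following is only a sketch that assumes plain k-means. It groups hypothetical barangays by two invented indicators so that similar barangays could receive similar interventions.

```python
import math

# Hypothetical barangay profiles: (poverty rate %, % of households without safe water).
barangays = {
    "A": (10.0, 4.0), "B": (12.0, 6.0), "C": (11.0, 5.0),
    "D": (35.0, 20.0), "E": (38.0, 24.0), "F": (36.0, 22.0),
}

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm), seeded with the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

names = list(barangays)
assignment, centroids = kmeans([barangays[n] for n in names], k=2)
for name, cluster in zip(names, assignment):
    print(name, "-> cluster", cluster)
```

In practice, the number of clusters and the seeding strategy are choices to tune; seeding with the first k points is used here only to keep the example repeatable.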

Group 2 - eCFulfill Inc

Group 2 is composed of Fellows Aleta, Geniston, Lacar, Laurel, and Perillo, and their project is Serving Philippines delicacies on the Global Menu: Making local products competitive on the global platform Amazon in collaboration with eCFulfill.

eCFulfill helps make Philippine MSME products available all over the world. However, many MSMEs lack the know-how to set up and manage their product listings so that they are globally competitive. And so, the group identified their data science problem to be, “How can we make our Philippine products Amazon Best Sellers?”

In order for them to answer this question, they came up with two objectives:

  • To identify important product listing features
  • To surface best seller practices.

To accomplish these objectives, let’s take a look at their project overview:

They first gathered data from the Amazon website through web scraping. Next, they preprocessed the data into a usable format and built a binary classification model. Based on their initial findings, they then explored the data further using Natural Language Processing (NLP).

Here are some interesting insights from their analysis:

  • Bestsellers have longer product titles and descriptions than non-bestsellers in all categories
  • Bestsellers have more images set up in their listings
  • Across all categories, bestsellers have more answered questions than non-bestsellers
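Insights like the ones above can come from simple group comparisons once the listings are scraped and cleaned. The sketch below is a minimal example under that assumption. The listings, field names, and numbers are all hypothetical, and no actual scraping is performed.

```python
from statistics import mean

# Hypothetical listings; titles, counts, and labels are invented for illustration.
listings = [
    {"title": "Philippine Dried Mangoes Naturally Sweet Snack Pack of 6",
     "images": 7, "answered_questions": 12, "bestseller": True},
    {"title": "Banana Chips Crispy Sweet Coconut Oil Fried 500g Bag",
     "images": 6, "answered_questions": 9, "bestseller": True},
    {"title": "Dried Mangoes 100g",
     "images": 2, "answered_questions": 1, "bestseller": False},
    {"title": "Banana Chips",
     "images": 3, "answered_questions": 0, "bestseller": False},
]

def summarize(rows, bestseller):
    """Average listing features for bestsellers (True) or non-bestsellers (False)."""
    group = [r for r in rows if r["bestseller"] == bestseller]
    return {
        "avg_title_words": mean(len(r["title"].split()) for r in group),
        "avg_images": mean(r["images"] for r in group),
        "avg_answered_questions": mean(r["answered_questions"] for r in group),
    }

best = summarize(listings, True)
other = summarize(listings, False)
for feature in best:
    print(feature, "| bestsellers:", best[feature], "| others:", other[feature])
```

The same per-listing features (title length, image count, answered questions) could also serve as inputs to a binary bestseller classifier.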

Group 3 - BooCA

Group 3 is composed of Fellows Anjelo, Tin, and Tan, with a project entitled BooCA, A Machine Learning Solution to deal with Booking Cancellations for Hotel Lita.

In this project, they decided to focus on hotel booking cancellations, since cancellations mean lost revenue opportunities and cause problems in staffing, supply purchases, and profitability. To counter these, hotels implement policies to avoid cancellations or to force them.

In the group’s case, their client, Hotel Lita, needs an approach to address these cancellations by answering these questions:

  • Which booking will likely be cancelled?
  • What could be the reasons?
  • How to approach the cancellation?

With these questions in place, the group came up with a solution: use machine learning to predict the probability that each booking will be cancelled.

With this approach, the group will be able to:

  • Help produce better forecasts and reduce uncertainty in management decisions
  • Help in revenue management, such as inventory allocation, pricing decisions, and other management contexts
  • Enable Hotel Lita to act on specific bookings, either to avoid their cancellation or to force it
  • Help in the development of booking and cancellation policies

To do this, the group collected reservation data from 30 different hotels worldwide. With that data, they built a CatBoost model, a gradient-boosted decision tree classifier that handles categorical features natively, and identified the features that contribute most to hotel cancellations.

Those features are the number of booking changes, lead time, and average daily rate in USD.
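Once a classifier produces a cancellation probability per booking, acting on it can be as simple as ranking and thresholding. The sketch below assumes those probabilities already exist (for example, from a trained classifier's probability output). The booking records, the `p_cancel` field, and the 0.5 threshold are hypothetical, and no CatBoost model is trained here.

```python
# Hypothetical bookings with an already-predicted cancellation probability.
# Field names and values are invented for illustration.
bookings = [
    {"id": "B001", "lead_time_days": 120, "booking_changes": 0, "adr_usd": 95.0, "p_cancel": 0.82},
    {"id": "B002", "lead_time_days": 3, "booking_changes": 1, "adr_usd": 140.0, "p_cancel": 0.10},
    {"id": "B003", "lead_time_days": 60, "booking_changes": 0, "adr_usd": 110.0, "p_cancel": 0.55},
    {"id": "B004", "lead_time_days": 14, "booking_changes": 2, "adr_usd": 80.0, "p_cancel": 0.31},
]

def flag_at_risk(rows, threshold=0.5):
    """Return bookings at or above the threshold, highest risk first."""
    risky = [r for r in rows if r["p_cancel"] >= threshold]
    return sorted(risky, key=lambda r: r["p_cancel"], reverse=True)

at_risk = flag_at_risk(bookings)
for booking in at_risk:
    print(booking["id"], booking["p_cancel"])
```

The threshold is a business decision: a lower value catches more likely cancellations at the cost of contacting more guests who would have shown up anyway.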

Group 4 - Team Dunkin

Group 4 is composed of the Fellow Moreno brothers, Juancho and Niño, together with Fellow Kyle. This group decided to take a different route from the other groups and went with sports analytics. Their capstone project is titled, PLAYER ARCHETYPES, What types of PBA and NBA players are out there?

The basketball industry is a huge business, especially in the Philippines. The NBA alone is estimated to be worth $49.5 billion. In the Philippines, it’s estimated that nearly 40 million Filipinos play or have played the sport. On top of that, the Philippines will also be hosting the FIBA Basketball World Cup in 2023, which makes their project relevant and timely.

With all these in mind, the group came up with two problem statements:

  • How can data science be used to better inform PBA stakeholders?
  • What player archetypes are present in the PBA and the NBA?

By answering these questions, the group will be able to identify different types of players in the NBA and PBA, compare their similarities and differences, and provide meaningful insights for PBA stakeholders.

To achieve these, the group collected data from basketball-reference.com and dribblemedia.com. The data consists of different quantitative stats such as points per game, rebounds per game, etc.

After scraping the data from its sources, they performed some data cleaning to make sure the data was in the proper format. Next, they modeled the data using K-Means Clustering and Soft K-Means. From the model results, they came up with player archetypes that later became the basis of their recommendations for PBA stakeholders.
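To show how Soft K-Means differs from regular K-Means, here is a minimal sketch in which each player gets a degree of membership in every cluster instead of a single label. The player names, stats, starting centroids, and the stiffness value `beta` are all assumptions made for illustration, not the group's actual setup.

```python
import math

# Hypothetical per-game stats: (points, rebounds). Invented for illustration.
players = {
    "Guard A": (24.0, 4.0), "Guard B": (22.0, 5.0),
    "Big A": (10.0, 12.0), "Big B": (12.0, 11.0),
    "Hybrid": (17.0, 8.0),
}

def responsibilities(points, centroids, beta):
    """Soft assignment: weight of each cluster for each point, each row summing to 1."""
    resp = []
    for p in points:
        weights = [math.exp(-beta * math.dist(p, c) ** 2) for c in centroids]
        total = sum(weights)
        resp.append([w / total for w in weights])
    return resp

def soft_kmeans(points, centroids, beta=0.05, iterations=25):
    """Alternate soft assignments and responsibility-weighted centroid updates."""
    centroids = [list(c) for c in centroids]
    dims = len(points[0])
    for _ in range(iterations):
        resp = responsibilities(points, centroids, beta)
        for k in range(len(centroids)):
            mass = sum(r[k] for r in resp)
            centroids[k] = [sum(r[k] * p[d] for r, p in zip(resp, points)) / mass
                            for d in range(dims)]
    return responsibilities(points, centroids, beta), centroids

names = list(players)
points = [players[n] for n in names]
# Starting centroids are assumed: one guard-like and one big-man-like player.
resp, centroids = soft_kmeans(points, [players["Guard A"], players["Big A"]])
membership = dict(zip(names, resp))
for name in names:
    print(name, [round(r, 2) for r in membership[name]])
```

In this toy data, the "Hybrid" player sits exactly between the two groups, so the membership ends up split about evenly. That is the kind of in-between player a hard K-Means label would hide.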

Here are all the different archetypes that the group came up with for the NBA:

  • Role Players - mediocre support players
  • Shooters - excellent 3-point shooters
  • Contract Players - borderline G League players
  • Supporting Big Men - not as good as the main big men
  • Main Big Men - dominate the paint
  • Stars - main core of a team

As for the PBA, here are the different archetypes they’ve established:

  • Supporting Bigs - mediocre but necessary
  • Shot-takers - prone to taking more long-range shots
  • 2-way Big Men - use their size to their advantage on both offense and defense
  • Role Players - just like supporting bigs, but focused on assists
  • Limited Players - limited playing time
  • Star Guards - leads the team with points and assists

Culmination

Wasn’t it amazing to witness all the different capstone projects? The final product, the conclusion, and the culmination of their entire bootcamp experience into one project!

It was inspiring to see how all four groups were able to apply data science to real-world cases and different industries, and of course let’s not forget, to create impact with the use of data!

This blog post is only a glimpse of what YOU can also experience and achieve if you join the Data Science Fellowship.

Harness the power of data and use it to create impact in fields and industries you are passionate about! Who knows, your capstone project might be the next one featured.

Never stop learning!

From the scrapbook of Basty Vergara | Connect with Basty via LinkedIn and Notion

RECOMMENDED NEXT STEPS

Updated for Data Science Fellowship Cohort 10 | Classes for Cohort 10 start on September 12, 2022.

  • If you’re ready to dive in
      • Enroll in the Data Science Fellowship via the sign up link here and take the assessment exam.
      • Note: The assessment exam is a key part of your application. The deadline for the assessment is on August 21, 2022.
  • If you want to know more
      • Read a more detailed guide on the Fellowship.
      • Book a 15-minute consultation with our Eskwelabs’ Admissions team.
