Social Media Influencers Detection, Analysis & Recommendation
Petr Podrouzek & Jan Rus (Socialbakers)


In recent years, social media influencers (mainly on Instagram) have gained immense popularity as more and more brands recognize their potential for marketing products and services. The challenge is to recommend one or more influencers for a particular marketing campaign. Companies might be interested in micro-influencers for a certain industry or region, and they also want to understand which content works for the audience of a particular influencer.

Once the right influencers are found, companies may want to start a marketing campaign with them and monitor its effectiveness. At Socialbakers, we started tackling this problem in November 2018 with a pilot that tested whether we could source Instagram business profiles and detect relevant attributes from them. We processed a large amount of semi-structured data (1 TB) and tried to estimate the demographics, geography, and interests of each influencer.

After that, we built a smart search algorithm that takes these attributes into account along with various metrics. Finally, we designed a machine-learning-based recommender that links content with influencers based on their audience; this content serves as inspiration for the campaign manager. Data exploration and prototyping were done in Databricks (PySpark). The final optimized ETL, which processes the data and persists the results in S3 and Elasticsearch, was also built in Databricks. The content recommender uses NLP and other ML approaches.
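To make the idea of a search that combines detected attributes with metrics more concrete, here is a minimal plain-Python sketch. It is not the Socialbakers implementation: the `Influencer` fields, the follower cap used to approximate "micro-influencers", the weights, and the 10% engagement ceiling are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Influencer:
    # Hypothetical profile record; field names are illustrative only.
    handle: str
    region: str
    interests: frozenset
    followers: int
    engagement_rate: float  # interactions per follower, between 0 and 1


def score(inf, wanted_interests, w_interest=0.6, w_engagement=0.4):
    """Blend interest overlap with a capped engagement metric (assumed weights)."""
    if wanted_interests:
        overlap = len(inf.interests & wanted_interests) / len(wanted_interests)
    else:
        overlap = 0.0
    # Treat 10% engagement as the ceiling so one metric cannot dominate.
    engagement = min(inf.engagement_rate / 0.10, 1.0)
    return w_interest * overlap + w_engagement * engagement


def search(influencers, region, wanted_interests, max_followers=100_000, top_k=3):
    """Filter by attributes first, then rank the remaining profiles by score."""
    candidates = [
        i for i in influencers
        if i.region == region and i.followers <= max_followers
    ]
    ranked = sorted(candidates, key=lambda i: score(i, wanted_interests), reverse=True)
    return [i.handle for i in ranked[:top_k]]


profiles = [
    Influencer("@a", "CZ", frozenset({"food", "travel"}), 20_000, 0.08),
    Influencer("@b", "CZ", frozenset({"fitness"}), 15_000, 0.09),
    Influencer("@c", "CZ", frozenset({"food"}), 2_000_000, 0.02),
    Influencer("@d", "DE", frozenset({"food"}), 30_000, 0.05),
]

result = search(profiles, region="CZ", wanted_interests=frozenset({"food"}))
print(result)
```

In this toy data set, `@c` is dropped by the follower cap and `@d` by the region filter, so only `@a` and `@b` are ranked. At the scale described above, the same filter-then-rank logic would more plausibly run as a query against a search index than over an in-memory list.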

We had no knowledge of Apache Spark before the project, so we had to learn the technology as we went. In this paper, we discuss two aspects of the influencers project. The first is the final influencer recommendation solution, where we used Databricks for innovative research and large-scale data engineering, including ML. The second is the challenges we faced while deploying Apache Spark from scratch and onboarding our teams to the new platform.

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unified-data-analytics-platform

