Pluto7 Blog

What is Machine Learning in Big Data, really?

Posted by Manohar Powar on Jan 6, 2017 5:10:30 AM

What is machine learning in Big Data?

Machine learning systems are computer systems that learn from data rather than being explicitly programmed to handle a task. The amount and quality of the data determine how well a machine learning model performs the given task.

To summarize, machine learning systems -

  • Learn from data
  • Require no explicit programming
  • Discover hidden patterns
  • Support data-driven decisions

Machine learning is an interdisciplinary field:

[Figure: Machine learning as an interdisciplinary field]

Machine Learning applications -

Machine learning is used in everyday life, for example in the Google search engine, credit card fraud detection, product recommendations on e-commerce sites, sentiment analysis, crime pattern detection, climate monitoring and drug effectiveness analysis.

Machine learning algorithms are used in data mining, data science and predictive analytics to solve problems.

Categories of Machine Learning -

  • Classification - Predicting the category of the input. e.g. predicting the weather as sunny, rainy etc. based on weather data collected through sensors
  • Regression - Predicting a numeric value (a continuous data attribute). e.g. predicting the price of a house in a city
  • Cluster Analysis - Organizing similar items into groups. e.g. customer segment analysis by age and location
  • Association Analysis - Finding a set of rules of association between events or items. e.g. market basket analysis, which is used in cross-selling
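
As a rough illustration of association analysis, the sketch below counts how often pairs of items appear together and derives simple rules with their support and confidence. The baskets, the `pair_rules` helper and the 0.5 support threshold are all assumptions made for this example, not part of any particular product.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with butter also contains bread, so the rule "butter -> bread" gets the highest confidence. Real market basket analysis usually relies on algorithms such as Apriori to handle larger item sets efficiently.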

Machine learning is also categorized as -

  • Supervised Learning - The target (labelled data) is provided.
  • Unsupervised Learning - The target is unknown or unavailable.
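
The difference between the two can be sketched in a few lines. In the supervised half the labels are given, so a simple nearest-centroid classifier can be learned from them; in the unsupervised half there are no labels, so the points are only grouped by similarity (a one-dimensional k-means with two groups). The temperature readings, labels and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical temperature readings (in degrees Celsius), invented for illustration.
points = [2.0, 4.0, 20.0, 24.0]
labels = ["cold", "cold", "warm", "warm"]  # only used by the supervised part

# Supervised: labels are provided, so we learn one centroid per label.
def fit_centroids(points, labels):
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    # Assign the label whose centroid is closest to x.
    return min(centroids, key=lambda y: abs(centroids[y] - x))

# Unsupervised: no labels, so we only group similar points (1-D k-means, k=2).
def two_means(points, iterations=10):
    lo, hi = min(points), max(points)  # simple deterministic initialisation
    left, right = list(points), []
    if lo == hi:
        return left, right  # all points identical, nothing to split
    for _ in range(iterations):
        left = [p for p in points if abs(p - lo) <= abs(p - hi)]
        right = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo, hi = sum(left) / len(left), sum(right) / len(right)
    return left, right

centroids = fit_centroids(points, labels)
print(predict(centroids, 18.0))  # supervised: returns a known label
print(two_means(points))         # unsupervised: returns unnamed groups
```

Note that the unsupervised result only says which points belong together; giving the groups a meaning such as "cold" or "warm" is left to the analyst.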

Machine Learning Process:

When implementing machine learning, the problem or opportunity must be clearly defined, with goals and objectives. The high-level flow consists of the following steps:

  • Acquire - Identifying and sourcing the business data
  • Prepare - Preliminary analysis of the data's characteristics, format and quality, followed by pre-processing: cleaning, selecting and transforming the data
  • Analyze - Selecting a machine learning model, training it, evaluating and testing it, and iterating until a suitable model gives accurate results
  • Report - Generating visualizations and reporting results
  • Act - Applying the results or inferences from the previous steps
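
The five steps above can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: the data (promotions per week versus units sold), the function names, and the choice of a one-variable least-squares line as the model. A real project would use far more data and a proper evaluation scheme.

```python
def acquire():
    # Hypothetical (promotions_per_week, units_sold) records; None marks a missing value.
    return [(1, 11.0), (2, 21.0), (3, None), (4, 41.0), (5, 51.0), (6, 61.0)]

def prepare(rows):
    # Clean: drop rows whose target value is missing.
    return [(x, y) for x, y in rows if y is not None]

def analyze(rows):
    # Hold out the last row for testing; fit y = a*x + b by least squares on the rest.
    train, test = rows[:-1], rows[-1:]
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in train)
    variance = sum((x - mean_x) ** 2 for x, _ in train)
    a = covariance / variance
    b = mean_y - a * mean_x
    # Evaluate: mean absolute error on the held-out data.
    error = sum(abs(a * x + b - y) for x, y in test) / len(test)
    return a, b, error

def report(a, b, error):
    return f"model: y = {a:.2f}*x + {b:.2f}, mean absolute error on held-out data: {error:.2f}"

def act(a, b, planned_x):
    # Apply the model to a planned input, e.g. to size next week's stock.
    return a * planned_x + b

a, b, error = analyze(prepare(acquire()))
print(report(a, b, error))
print(act(a, b, 7))
```

In practice the Analyze step is repeated with different models and features until the held-out error is acceptable, which is the iteration the flow above refers to.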

How to go about implementing Machine Learning -

Implementing machine learning requires extensive business experience, big data technologies and compute-intensive infrastructure. Most cloud service providers now offer machine learning as a cloud service, which makes the technology accessible to a broader range of people at an affordable price. e.g. Google, Azure, Amazon and IBM provide software stacks for implementing machine learning on the cloud.

Machine Learning in Solving Supply Chain Problems -

Pluto7 brings cross-domain expertise to solving supply chain problems, leveraging Google Cloud Platform products such as Google Cloud Storage, Google Cloud Dataflow, Google BigQuery and Google Cloud Machine Learning as a service.

[Figure: Google Cloud machine learning architecture]

Recommender systems, one of the applications of ML, have become extremely common in recent years. They are used in a variety of areas: popular applications include movies, music, news, books, research articles, search queries, social tags, and products in general. Pluto7 brings expertise in building recommendation engines with various algorithms to solve supply chain management problems, helping businesses improve their performance.

How a Recommendation Engine works -

A recommendation system needs to be able to learn from your users by collecting data about their tastes and preferences. Over time and with enough data, machine learning algorithms improve enough to perform useful analysis and deliver meaningful recommendations.

A core component of building a recommendation engine is filtering. The most common approaches include -

  • Content-based filtering - The recommended product has attributes similar to what the user views or likes.
  • Cluster filtering - The recommended products go well together, regardless of what other users have done.
  • Collaborative filtering - Other users who like the same products the user views or likes also liked the recommended product.

Collaborative filtering is a comprehensive approach that lets you treat product attributes as abstract and make predictions based on users' tastes. Its output rests on the assumption that two users who liked the same products in the past will probably like the same ones now.
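
A minimal sketch of user-based collaborative filtering follows. It assumes a small, hypothetical table of user ratings and uses cosine similarity between users, which is one common choice among several; the user names, products and `recommend` helper are invented for illustration and do not describe any specific engine.

```python
from math import sqrt

# Hypothetical ratings (user -> {product: rating from 1 to 5}), invented for illustration.
ratings = {
    "ana":   {"pallets": 5, "labels": 4, "tape": 1},
    "ben":   {"pallets": 5, "labels": 5, "scanner": 4},
    "chris": {"tape": 5, "boxes": 4},
}

def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[item] * v[item] for item in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=1):
    """Rank products the user has not rated, weighted by how similar each rater is."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("ana", ratings))
```

Here "ana" and "ben" rated the same products similarly, so a product that only "ben" has rated is ranked first for "ana". No product attributes are used at all, which is what distinguishes this from content-based filtering.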

Machine learning will power supply chain solutions and lead business transformation in the coming years. With cheap cloud computing infrastructure, it is likely to see wide implementation.

Topics: Machine Learning