CUSTOMER SEGMENTS
A project that applies unsupervised learning to customer data to show the trade trends of local vendors, focusing on the kinds of products their customers are most interested in.
Project status: Published/In Market
Overview / Usage
This project draws inferences from customer spending trends so that local vendors can optimize their trade, using unsupervised learning.
Because the data is unlabelled and too complex for patterns to be read off directly, an unsupervised approach is used to derive these inferences.
The project finds patterns in the features of diverse customers and turns them into trends that vendors can use to understand which items customers are willing to spend money on. This helps vendors invest in the specific items that benefit their local marketing strategies.
Methodology / Approach
The project focuses on extracting essential patterns from the data with unsupervised learning, since the data is complex and unlabelled.
The data set comes from the UCI Machine Learning Repository. In the initial data exploration phase, "matplotlib" and "seaborn" are used to build interactive visualizations that illustrate the correlations among the features in the data set.
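As a rough illustration of the correlation step, the sketch below computes a Pearson correlation matrix with NumPy. The synthetic columns and the helper name `correlation_matrix` are assumptions made for this example; they are not taken from the project's data or code.

```python
import numpy as np


def correlation_matrix(data):
    """Pearson correlation between the columns of a (samples x features) array."""
    return np.corrcoef(data, rowvar=False)


# Synthetic stand-in data (hypothetical, not the project's data set).
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + rng.normal(scale=0.1, size=200)  # strongly tied to a
c = rng.normal(size=200)                       # unrelated to a and b

corr = correlation_matrix(np.column_stack([a, b, c]))
print(np.round(corr, 2))
```

A matrix like this is the kind of input that seaborn's `heatmap` function can draw as a colored grid.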
Next, feature relevance is assessed using a cross-validation set that is split off from the input data set for the training and testing phases of the project.
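One common way to check feature relevance on unlabelled data is to hold one feature out, train a regressor to predict it from the remaining features, and score the result on a held-out split. The sketch below follows that idea under several assumptions: the description does not name the model, so the decision tree regressor, the 25% test split, the synthetic columns, and the helper name `predictability_score` are all illustrative choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def predictability_score(data, target_idx, random_state=0):
    """R^2 for predicting one column from all the others on a held-out split.

    A high score means the column is largely redundant given the others;
    a low score means it carries information the others do not.
    """
    X = np.delete(data, target_idx, axis=1)
    y = data[:, target_idx]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    model = DecisionTreeRegressor(random_state=random_state)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Synthetic stand-in data (hypothetical): c depends on a and b, d is independent.
rng = np.random.default_rng(0)
a = rng.uniform(0, 1, 400)
b = rng.uniform(0, 1, 400)
c = a + b + rng.normal(scale=0.05, size=400)
d = rng.uniform(0, 1, 400)
data = np.column_stack([a, b, c, d])

scores = {
    name: predictability_score(data, idx)
    for idx, name in enumerate(["a", "b", "c", "d"])
}
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.2f}")
```

In this setup a feature with a low or negative score is hard to reconstruct from the others, which suggests it is worth keeping for the later clustering step.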
In the data pre-processing phase, the raw input data is cleaned with Python scripts, feature scaling is applied, and the result is visualized as a scatter matrix using the pandas module. The processed data is then checked for outliers with Tukey's method, which is based on the inter-quartile range (IQR).
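The scaling and outlier steps can be sketched as follows. Tukey's method flags any value that lies more than 1.5 times the inter-quartile range below the first quartile or above the third quartile. The natural-log transform used here as the scaling step, the sample values, and the helper name `tukey_outlier_mask` are assumptions for illustration; the description does not state which scaling was applied.

```python
import numpy as np


def tukey_outlier_mask(values, k=1.5):
    """Boolean mask that is True where a value falls outside Tukey's fences."""
    q1, q3 = np.percentile(values, [25, 75])
    step = k * (q3 - q1)  # k times the inter-quartile range
    return (values < q1 - step) | (values > q3 + step)


# Hypothetical spending values for one feature, with one extreme customer.
spend = np.array([12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 14.0, 400.0])

# Assumed scaling step: a log transform to reduce skew.
scaled = np.log(spend)

mask = tukey_outlier_mask(scaled)
print("outlier values:", spend[mask])
```

Applying the mask per feature and then reviewing rows flagged in more than one feature is one possible way to decide which points to drop.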
It is difficult to draw inferences for an unsupervised learning model directly from the data, whether raw or pre-processed, since the outcome depends on the algorithm chosen for the later steps. Hence Principal Component Analysis (PCA) is applied as a dimensionality reduction phase to shrink the feature space of the input data set, and a bi-plot is drawn to describe the patterns in the input data.
Finally, the clustering phase groups the items that customers are willing to buy, so that vendors can see what customers spend more money on. This is done with a Gaussian mixture model, which separates the input domain into clusters based on the characteristics of the data, and the clusters are visualized with the seaborn library.
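The reduction and clustering phases can be sketched with scikit-learn as shown below. The two PCA components, the two mixture components, the silhouette score used as a quality check, and the synthetic data are all assumptions made for this example; the description does not state the values or metrics that were actually used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data (hypothetical): two customer groups in six features.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(100, 6))
group_b = rng.normal(loc=4.0, scale=0.5, size=(100, 6))
data = np.vstack([group_a, group_b])

# Dimensionality reduction: project the six features onto two components.
pca = PCA(n_components=2, random_state=0)
reduced = pca.fit_transform(data)
print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))

# Clustering: fit a Gaussian mixture and assign each point to a component.
gmm = GaussianMixture(n_components=2, random_state=0)
labels = gmm.fit_predict(reduced)

# Assumed quality check: silhouette score, closer to 1 means cleaner clusters.
score = silhouette_score(reduced, labels)
print(f"silhouette score: {score:.2f}")
```

Repeating the fit for several values of `n_components` and comparing the scores is one way to pick the number of clusters before plotting `reduced` colored by `labels`.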
The final model helps vendors see the patterns in what customers are willing to spend more on, which supports their decisions about which products to invest in and, in turn, helps them optimize their marketing strategies.
Technologies Used
Artificial Intelligence
Machine Learning
Unsupervised Learning
Python
Pandas
Matplotlib
Seaborn
Clustering
Gaussian Mixture
Feature Relevance (Importance)
Data Exploration
Data Visualization
Feature Scaling
Dimensionality Reduction
Tukey's Method for Outlier Detection
Inter-Quartile Range
Feature Transformation (PCA)