Contributors
Project Overview:
The Main Objective:
Open datasets are scarce in the telecom domain, so large datasets are hard to obtain, especially for machine learning projects. We therefore aim to provide synthetic data that researchers and telecom companies can use in their experiments.
Why is there a need to generate synthetic data?
In the world of information technology, companies use data to improve the customer experience and provide better services to their customers. Sometimes, the collection of data can be tedious and costly.
Currently, GANs are widely used to generate image data, but comparatively little work has been done on tabular data. One reason is that the quality of non-image synthetic data is difficult to evaluate. In this work, we try to generate synthetic data from scratch.
GANs:
GAN stands for Generative Adversarial Network. The name may sound complex, but the idea is straightforward. Ian Goodfellow et al. published “Generative Adversarial Networks” in 2014, which was the first study to describe GANs. GANs are generative models: they create new data instances that resemble the training data.
A GAN consists of two components, the generator and the discriminator. The role of the generator is to produce new data (numbers, images, etc.) that is as close as possible to the dataset provided as input, and the role of the discriminator is to distinguish generated data from real input data. Since their introduction, GANs have received a lot of attention, as they are perhaps one of the most effective techniques for generating synthetic data.
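The generator/discriminator interplay described above can be illustrated with a deliberately tiny sketch. It is not this project's implementation. It assumes one-dimensional "real" data drawn from a normal distribution, a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), and the commonly used non-saturating generator loss. All names (train_toy_gan, sample) and hyperparameters are hypothetical, chosen only for illustration, and gradients are written out by hand so that only the Python standard library is needed.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on toy 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters (assumed form: G(z) = a*z + b)
    w, c = 0.1, 0.0  # discriminator parameters (assumed form: sigmoid(w*x + c))
    for _ in range(steps):
        # Stand-in for a real dataset: samples from N(4, 1).
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: minimize -[log D(real) + log(1 - D(fake))].
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += -(1.0 - d) * x  # gradient of -log D(x) w.r.t. w
            gc += -(1.0 - d)      # gradient of -log D(x) w.r.t. c
        for x in fake:
            d = sigmoid(w * x + c)
            gw += d * x           # gradient of -log(1 - D(x)) w.r.t. w
            gc += d               # gradient of -log(1 - D(x)) w.r.t. c
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Generator step: minimize -log D(G(z)), using the updated discriminator.
        ga = gb = 0.0
        for z in noise:
            x = a * z + b
            d = sigmoid(w * x + c)
            grad_x = -(1.0 - d) * w  # gradient of -log D(x) w.r.t. x
            ga += grad_x * z         # chain rule: dx/da = z
            gb += grad_x             # chain rule: dx/db = 1
        a -= lr * ga / batch
        b -= lr * gb / batch
    return (a, b), (w, c)


def sample(gen_params, n, seed=0):
    """Draw n synthetic values from the trained generator."""
    a, b = gen_params
    rng = random.Random(seed)
    return [a * rng.gauss(0.0, 1.0) + b for _ in range(n)]


if __name__ == "__main__":
    gen, disc = train_toy_gan()
    print("generator parameters:", gen)
    print("synthetic samples:", sample(gen, 5))
```

A linear discriminator like this one can only push the generator toward matching simple properties of the real data, and GAN training can oscillate rather than settle, so the sketch shows the training loop structure rather than a usable tabular generator. Realistic tabular data would need neural networks for both components and handling for categorical columns.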
Approach:
To generate synthetic data, we are considering deep learning techniques such as GANs.