Data science:
Data is classified into three types:
Unstructured data: data collected without a predefined format, such as social media data, phone calls, and weather data.
Structured data: data collected in a defined, specific structure.
Database and application data: for example, customer and invoice data.
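To make the distinction concrete, the sketch below handles one structured data set and one piece of unstructured text using only Python's standard library. The invoice rows and the social media post are made-up sample values, not data from these notes.

```python
import csv
import io

# Structured data: every row follows the same named columns (hypothetical invoice data).
invoice_csv = "invoice_id,customer,amount\n1001,Asha,250.00\n1002,Ben,99.50\n"
rows = list(csv.DictReader(io.StringIO(invoice_csv)))
total = sum(float(row["amount"]) for row in rows)

# Unstructured data: free text with no predefined fields (hypothetical social media post).
post = "Loving the rainy weather today! Called support twice, still waiting."
word_count = len(post.split())

print(rows[0]["customer"], total)
print(word_count)
```

The structured rows can be queried by column name straight away, while the free text first needs some processing (here, a simple word split) before anything can be measured.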
The ML life cycle has four stages:
Build, Train, Deploy, Manage
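One way to picture the four stages is the toy sketch below, written in plain Python. It is only an illustration under stated assumptions: the function names `train`, `deploy`, and `predict` are hypothetical, the data is made up, and "deploying" is reduced to serializing the model parameters as JSON.

```python
import json

def train(xs, ys):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

def deploy(model, version):
    """Deploy: serialize the trained parameters so another process can load them."""
    return json.dumps({"version": version, "params": model})

def predict(artifact, x):
    """Serve a prediction from a deployed artifact."""
    params = json.loads(artifact)["params"]
    return params["slope"] * x + params["intercept"]

# Build: choose the training data and the model form (hypothetical values, y = 2x + 1).
xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]

model = train(xs, ys)
artifact = deploy(model, version=1)

# Manage: check the deployed model against new data before trusting it.
error = abs(predict(artifact, 5) - 11)
print(model, error)
```

In practice each stage is handled by dedicated tooling, but the hand-off is the same: data and model choice go into training, a trained artifact goes into deployment, and monitoring keeps checking the deployed artifact.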
JupyterLab:
* It is a web-based interface for notebook sessions
* It lets you work with documents and activities in an integrated way, including:
* Jupyter Notebook
* Text editors
* Terminals
Features of JupyterLab:
Menu Bar: Top-level menus that expose the actions available in JupyterLab
Launcher: Provides easy access to your notebooks, consoles, text editor, and environment explorer
Left Sidebar: Contains the file browser and the command palette
Conda environment:
Conda is an open-source package and environment management system.
Features of Conda Environment:
* Install and update packages and their dependencies
* Maintain different sets of software
* Switch between environments
* Develop notebooks and deploy models
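When switching between environments, it helps to confirm which one a notebook or script is actually running in. The sketch below is one way to check from Python; the helper name `describe_environment` is hypothetical. It relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated, and falls back to a placeholder string when no conda environment is active.

```python
import os
import sys

def describe_environment():
    """Return the active conda environment name (if any) and interpreter details."""
    return {
        # conda sets CONDA_DEFAULT_ENV on activation; it is absent outside conda.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)"),
        # sys.prefix is the root directory of the Python installation in use.
        "python_prefix": sys.prefix,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
    }

for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this in two different environments should show a different prefix for each, which is a quick sanity check that the switch took effect.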
Use of Conda environments in Data Science:
They provide the following frameworks for building models:
* PyTorch
* TensorFlow
General Machine Learning [GML]:
Use cases:
* Data manipulation
* Supervised machine learning
* AutoML functionality
* Machine learning explainability
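The first two use cases can be sketched in a few lines of plain Python: one-hot encoding as the data manipulation step, followed by a 1-nearest-neighbour classifier as the supervised learning step. This is a hand-written stand-in for illustration only; libraries such as scikit-learn provide tested implementations of both. The function names and the churn data are hypothetical.

```python
def one_hot(value, categories):
    """Data manipulation: encode a categorical value as a 0/1 vector."""
    return [1.0 if value == category else 0.0 for category in categories]

def nearest_neighbour(train_rows, train_labels, row):
    """Supervised learning: predict the label of the closest training row."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(train_rows)), key=lambda i: squared_distance(train_rows[i], row))
    return train_labels[best]

# Hypothetical labelled data: (plan type, monthly usage in hours, churned?).
plans = ["basic", "premium"]
raw = [
    ("basic", 2.0, "yes"),
    ("basic", 3.0, "yes"),
    ("premium", 20.0, "no"),
    ("premium", 25.0, "no"),
]

features = [one_hot(plan, plans) + [hours] for plan, hours, _ in raw]
labels = [label for _, _, label in raw]

prediction = nearest_neighbour(features, labels, one_hot("premium", plans) + [22.0])
print(prediction)
```

The labels are what make this supervised: the classifier only learns the mapping because every training row comes with a known answer.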
Top Libraries used for GML:
* category_encoders
* lightgbm
* scikit-learn
* TensorFlow
Natural Language Processing [NLP]:
Use cases:
* Text extraction
* Part-of-speech tagging
* Key phrase extraction
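As a rough illustration of text extraction and key phrase extraction, the sketch below tokenizes a sentence and ranks words by frequency using only the standard library. It is a deliberately simple frequency-based stand-in, not how a dedicated NLP library works internally, and it does not cover part-of-speech tagging. The stop-word list, function names, and sample sentence are assumptions made for the example.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list (an assumption; real libraries ship fuller lists).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "for", "it", "on"}

def tokenize(text):
    """Text extraction: lowercase the text and pull out alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def top_keywords(text, k=2):
    """Rank the words that are not stop words by how often they occur."""
    counts = Counter(word for word in tokenize(text) if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(k)]

sample = "The model is trained on data. The data is cleaned, and the model is deployed."
print(tokenize(sample)[:4])
print(top_keywords(sample))
```

Counting words is crude, but it shows the usual shape of the task: normalize the text, drop uninformative words, then score what remains.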
Top Libraries of NLP:
* nltk
* transformers
* eli5
* lime
ONNX:
ONNX (Open Neural Network Exchange) is an open-source format for representing machine learning models.
Use cases:
* Portability and interoperability between ML frameworks
* The ONNX Runtime library allows us to run a model on different platforms.
Top Libraries of ONNX:
* onnx
* onnxconverter-common
* onnxmltools
* onnxruntime
PyTorch:
PyTorch is an open-source machine learning framework. It is used for many deep learning algorithms.
Use cases:
* Computer vision, NLP and general machine learning
* Deep neural networks and algorithms for deep learning.
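Deep neural networks are built from many simple units trained by gradient descent. The sketch below trains a single linear neuron in plain Python to show that core loop. It is an illustration only: frameworks such as PyTorch compute the gradients automatically and operate on tensors, whereas here the gradients are written out by hand. The function name, data, and hyperparameters are assumptions.

```python
def train_neuron(samples, learning_rate=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training data generated from y = 3x + 1.
samples = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]
w, b = train_neuron(samples)
print(round(w, 2), round(b, 2))
```

A deep network repeats the same idea across many layers of neurons with non-linear activations, which is where automatic differentiation becomes essential.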
Top Libraries of PyTorch:
* pandas
* daal4py
TensorFlow:
Use cases:
* Machine learning
* Deep neural networks
* Flexible architecture run on CPUs, GPUs and TPUs
Top Libraries of TensorFlow:
* pandas
* scikit-learn
* TensorBoard
* TensorFlow