One of the best things about Python is its vast ecosystem of open-source libraries. There is a library for almost anything. If a library can solve a problem, why not save your precious time and give it a try? Today, I will introduce you to 5 libraries that you have probably never heard of but should add to your pipeline. Let’s get started!
When you start typing your code for a project, what is your first step? You probably import the libraries you will need, right? The problem is that you never know how many libraries you will need until you use one you forgot to import and get an error. That’s why PyForest is one of the handiest libraries that I know. PyForest can import the 40 most popular libraries to your notebook with one line of code. Forget about trying to remember how to import each library. PyForest does that for you. I have written a whole blog post about it, but in short, you install it, import it, and use it, all in a few seconds. How about the aliases? Don’t worry about them. The libraries are imported with the aliases we are all familiar with.
How to use it
You can install it by typing
pip install pyforest in your terminal, and you are good to go. To import it to your notebook, type
from pyforest import * and you can start using your libraries. To check which libraries are available, type
lazy_imports() and PyForest will list them.
All the libraries in that list are ready to use. Technically, each one is imported only when your code first uses it; until then, nothing is loaded. The list includes libraries such as Pandas, Matplotlib, Seaborn, TensorFlow, Scikit-Learn, NLTK, XGBoost, Plotly, Keras, NumPy, and many others.
I mostly use PyForest for personal projects or projects that will not be reviewed by other people. If other people will review your code, I don’t recommend PyForest, because it hides which libraries are actually being imported.
Emot is a nice-to-have library that can greatly improve your next NLP project. It transforms emojis and emoticons into descriptive information. For example, imagine that someone posted “I ❤️ Python” on Twitter. The person didn’t say the word love. Instead, they used an emoji. If you use this tweet in an NLP project and simply strip the emoji, you lose a big piece of information. That’s where Emot comes in. It transforms emojis and emoticons into words. For those who are not familiar, emoticons are ways to express sentiments using plain characters. For example,
:) for a smiley face or
:( for a sad face.
How to use it
To install it, you can type
pip install emot, and you are good to go. Then you will need to import it into your notebook by typing
import emot. You will need to decide if you want to figure out the meaning of emojis or emoticons. For emojis, the code is
emot.emoji(your_text). Let's check it out with an example:
In that example, I passed the sentence
I ❤️ Python 🙂 to Emot. It returned a dictionary with the values, the descriptions, and the locations. Like any dictionary, you can index it by key and focus on the information that you need. If I type
ans['mean'], it will return only the emoji descriptions.
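To make the shape of that result more concrete, here is a small standard-library sketch that does the same kind of lookup for the two emoticons mentioned earlier. This is not Emot's code or its API: the EMOTICON_MEANINGS table and the describe_emoticons function are hypothetical, and I only modeled the 'value', 'mean', and 'location' keys on the dictionary described above.

```python
import re

# Hypothetical, hand-written lookup table for illustration only;
# a real library ships a far larger one.
EMOTICON_MEANINGS = {
    ":)": "smiley face",
    ":(": "sad face",
}


def describe_emoticons(text):
    """Return the emoticons found, their meanings, and their [start, end] positions."""
    pattern = "|".join(re.escape(token) for token in EMOTICON_MEANINGS)
    result = {"value": [], "mean": [], "location": []}
    for match in re.finditer(pattern, text):
        result["value"].append(match.group())
        result["mean"].append(EMOTICON_MEANINGS[match.group()])
        result["location"].append([match.start(), match.end()])
    return result


ans = describe_emoticons("I love Python :) but not bugs :(")
print(ans["mean"])      # only the descriptions
print(ans["location"])  # where each emoticon sits in the text
```

Keeping the locations alongside the descriptions means you can replace each emoticon with its meaning in place instead of dropping it.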
I’m including Geemap in this list, but to be honest, it deserves an entire blog post of its own. In short, Geemap is a Python library for interactive mapping with Google Earth Engine. You are probably familiar with Google Earth and all its power, so why not use it for your next project? I’m planning to create a project that explores its features in the next few weeks. In the meantime, here is how you can install it and start using it.
How to use it
You can install it by typing
pip install geemap in your Terminal. To import it to your notebook, you can type
import geemap. For demonstration purposes, I will create a folium-based interactive map using the following code:
import geemap.eefolium as geemap
Map = geemap.Map(center=[40,-100], zoom=4)
As I mentioned, I haven’t explored it as much as it deserves, but the project has a thorough GitHub README that explains how it works and what it can do.
I learned about Dabl yesterday, and after doing some research, I found out that it deserves its own blog post, but let’s cover the basics. Dabl aims to make machine learning modelling more accessible for beginners. For this reason, it offers low-code solutions for machine learning projects. Dabl simplifies data cleaning, creating visualizations, building baseline models, and explaining models. Let’s quickly review some of its features.
How to use it
First, to install it, you can just type
pip install dabl in your terminal. Then, you can import Dabl to your notebook by typing
import dabl. You are good to go from here. You can use
dabl.clean(data) to get information about the features, such as whether any of them are useless. It also shows which features are continuous, categorical, or high-cardinality.
You can use
dabl.plot(data) to generate visualizations about a specific feature:
And finally, you can create multiple models with one line of code using
dabl.SimpleClassifier(), which you use just like a Scikit-Learn estimator. However, you still have to take some of the usual steps yourself, such as creating the training and testing datasets and instantiating, fitting, and scoring the model. Then, you can use Scikit-Learn to evaluate the model.
# Importing Dabl and the Scikit-Learn helpers
import dabl
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
# Setting X and y variables
X, y = load_digits(return_X_y=True)
# Splitting the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
# Calling the model
sc = dabl.SimpleClassifier().fit(X_train, y_train)
# Evaluating accuracy score
print("Accuracy score", sc.score(X_test, y_test))
As we can see, Dabl iterated through multiple models, including a Dummy Classifier, GaussianNB, Decision Trees with different depths, and Logistic Regression, and at the end it showed the best one. It fit all these models in about 10 seconds. Cool, right? I decided to test the final model using Scikit-Learn to make sure that this result was trustworthy. Here is the result:
I got 0.968 accuracy the conventional way and 0.971 with Dabl. That’s close enough for me! Note that I didn’t have to import the Logistic Regression model from the Scikit-Learn library because it was already imported with PyForest. I must confess that I prefer LazyPredict, but Dabl is worth trying. There is much more to show about Dabl, and I will work on a blog post dedicated to it with more details. Stay tuned!
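The idea behind this step, fitting a few simple models and keeping the best one, is easy to sketch without any library. The toy version below is only my illustration of that idea and not how Dabl works internally: the synthetic data, the MajorityClass and NearestNeighbor classes, and the accuracy helper are all hypothetical stand-ins, and the model list is much shorter than the one Dabl tries.

```python
import random


def make_toy_data(n_per_class=50, seed=1):
    """Two well-separated 2-D clusters labeled 0 and 1 (synthetic data)."""
    rng = random.Random(seed)
    rows = []
    for label, center in ((0, 0.0), (1, 10.0)):
        for _ in range(n_per_class):
            point = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
            rows.append((point, label))
    rng.shuffle(rows)
    return rows


class MajorityClass:
    """Dummy baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighbor:
    """1-nearest-neighbor classifier using squared Euclidean distance."""

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def _closest_label(self, point):
        distances = [
            sum((a - b) ** 2 for a, b in zip(point, other)) for other in self.X_
        ]
        return self.y_[distances.index(min(distances))]

    def predict(self, X):
        return [self._closest_label(point) for point in X]


def accuracy(model, X, y):
    """Fraction of predictions that match the true labels."""
    predictions = model.predict(X)
    return sum(p == t for p, t in zip(predictions, y)) / len(y)


# Splitting the toy dataset into train (75%) and test (25%) sets
rows = make_toy_data()
split = int(0.75 * len(rows))
X_train, y_train = [r[0] for r in rows[:split]], [r[1] for r in rows[:split]]
X_test, y_test = [r[0] for r in rows[split:]], [r[1] for r in rows[split:]]

# Fitting every candidate and recording its test accuracy
scores = {}
for model in (MajorityClass(), NearestNeighbor()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = accuracy(model, X_test, y_test)

best_name = max(scores, key=scores.get)
print(scores, "best:", best_name)
```

Always including a dummy model in the comparison is the useful part: it tells you how much of the score a real model actually earned.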
Sweetviz is a low-code Python library that generates beautiful visualizations to kickstart your exploratory data analysis with two lines of code. The output is an interactive HTML file. Like other libraries that I mentioned today, Sweetviz deserves its own blog post, and I will publish one soon. For now, let’s get a high-level overview of it.
How to use it
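You can install it by typing
pip install sweetviz in your terminal. Then import it, analyze your dataframe, and write the report to an HTML file:
import sweetviz as sv
my_report = sv.analyze(dataframe)
my_report.show_html()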
Did you see that? Sweetviz created an EDA HTML file with information about the entire dataset and broke it down so that you can analyze each feature individually. For each feature, you can see its numerical and categorical associations with other features, as well as its largest, smallest, and most frequent values. The visualization also changes depending on the data type. You can do so much more with Sweetviz, but I will keep that for another blog post. In the meantime, I highly recommend that you try it out.
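If you want a feel for what a per-feature summary like this contains, here is a tiny standard-library sketch that computes the smallest, largest, and most frequent values and adapts to the data type. It is only an illustration under my own assumptions and has nothing to do with Sweetviz's internals: the summarize_column function and the toy dataset are hypothetical, and there is no HTML output or association analysis here.

```python
from collections import Counter


def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def summarize_column(values):
    """Summarize one feature; the summary shape depends on the data type."""
    present = [v for v in values if v is not None]
    summary = {"missing": len(values) - len(present)}
    if present and all(is_number(v) for v in present):
        summary.update(kind="numerical", smallest=min(present), largest=max(present))
    else:
        summary["kind"] = "categorical"
    # The most frequent value is reported for both kinds of feature.
    summary["most_frequent"] = Counter(present).most_common(1)[0][0] if present else None
    return summary


# Hypothetical toy dataset stored as column name -> list of values.
dataset = {
    "age": [34, 29, None, 29, 51],
    "city": ["Lisbon", "Porto", "Lisbon", "Faro", "Lisbon"],
}

report = {name: summarize_column(column) for name, column in dataset.items()}
for name, summary in report.items():
    print(name, summary)
```

Writing even this much by hand for every column is exactly the chore that an automated EDA report saves you.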
PyForest, Emot, Geemap, Dabl, and Sweetviz are libraries that deserve to be known because they turn complicated tasks into straightforward ones. If you use these libraries, you will save your precious time for the tasks that matter.
I recommend you try them out and explore the features that I didn’t mention here. If you do, let me know what you found out about them. Thank you for reading!