Text exploration has always been my favourite process in Text Analytics. I am always thrilled when I find something interesting. I did two text analytics projects during my studies, in both of which I used Topic Modelling to study the topics discussed in long texts.
I never read the text before doing the analysis, and I should not, because the text is extremely long; it is simply not practical to read it all just to understand the topics. …
In Automate Excel with Python, I shared the concepts of the Excel Object Model, which comprises Objects, Properties, Methods and Events. The tricks to access the Objects, Properties, and Methods in Excel with the Python
pywin32 library are also explained with examples.
Now, let us automate an Excel report with the Pivot Table, one of the most wonderful functions in Excel!
You may be curious: why don’t we use the
pandas library instead? It comes bundled with distributions like Anaconda, so we often don’t even need to install it.
Well, the two
pandas functions mentioned above can create a Pivot Table easily, but…
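For comparison, here is a minimal sketch of the pandas route using pandas.pivot_table; the column names and sales figures below are made up purely for illustration:

```python
import pandas as pd

# Hypothetical sales data for illustration only
df = pd.DataFrame({
    "Region": ["North", "North", "South", "South"],
    "Product": ["A", "B", "A", "B"],
    "Sales": [100, 150, 200, 250],
})

# Summarise Sales by Region, much like an Excel Pivot Table would
pivot = pd.pivot_table(df, index="Region", values="Sales", aggfunc="sum")
print(pivot)
```

This produces the aggregation in a DataFrame, but the output lands in Python rather than as a native, refreshable Pivot Table object inside the Excel file itself.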
Most of the time, an organization has multiple data sources, and a data scientist or data analyst has to extract and compile the data from these different sources into one Excel file before performing analysis or creating a model. This can be a complex and time-consuming task if done manually.
Doing this with the most famous Python library,
pandas, will shorten the time, but the hard-coded Excel file might not be favoured by other domain users who access the Excel file directly. …
“Regular Expression (RegEx) is one of the unsung successes in standardization in computer science.”
In the example from my previous article, regular expressions are used to clean up the noise and tokenize the text. Well, what we can do with RegEx in Text Analytics goes far beyond that. In this article, I am sharing how to use RegEx to extract, from text data or a corpus, the sentences that contain any keyword from a defined list. …
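The keyword-filtering idea can be sketched in a few lines of standard-library Python; the toy corpus, the keyword list, and the naive sentence splitter below are all assumptions for illustration:

```python
import re

# Toy corpus; splitting sentences on ., ! or ? is a deliberate simplification
text = ("The model improves accuracy. Training took two hours. "
        "Accuracy on the test set was high! Deployment is pending.")

keywords = ["accuracy", "deployment"]  # assumed keyword list

# One alternation pattern: case-insensitive, whole words only
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keywords)) + r")\b",
                     re.IGNORECASE)

# Split into sentences, then keep only those matching any keyword
sentences = re.split(r"(?<=[.!?])\s+", text)
matches = [s for s in sentences if pattern.search(s)]
print(matches)
```

Escaping each keyword with re.escape keeps the pattern safe even when a keyword contains characters that are special in RegEx.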
The Internet has connected the world, while social media platforms like Facebook, Twitter and Reddit provide the space for people to express their opinions and feelings about a topic. The proliferation of smartphones has directly increased the usage of these platforms. For instance, 96% of active Facebook users, or 2,240 million people, access Facebook via smartphones and tablets.
The increase in social media usage has grown the volume of text data and boosted research in Natural Language Processing (NLP), for example, Information Retrieval and Sentiment Analysis. Most of the time, the documents or the…
Have you been in a position where you search through your mailbox to download all the attachments you need? Then maybe you leave, come back, and forget where you stopped? Perhaps you still have to save them to different directories or folders afterwards?
I have been in that position before. That is why I wanted to automate the process of downloading each attachment to the correct directory, and then transform the email’s attachment accordingly.
In this article, I compare the possible Python libraries for the solution and share how I automated the process with Python.
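The attachment-extraction step can be sketched with the standard-library email module; the message below is constructed by hand purely for illustration, whereas in the real workflow it would first be fetched from the mailbox (e.g. over IMAP):

```python
from email.message import EmailMessage

# Build a toy message with an attachment; in practice this object would
# come from a mailbox fetch, not be constructed by hand.
msg = EmailMessage()
msg["Subject"] = "Monthly report"
msg.set_content("Please find the report attached.")
msg.add_attachment(b"col1,col2\n1,2\n",
                   maintype="application", subtype="octet-stream",
                   filename="report.csv")

# Walk the MIME parts and collect (filename, raw bytes) for each attachment
attachments = []
for part in msg.walk():
    filename = part.get_filename()
    if filename:
        attachments.append((filename, part.get_payload(decode=True)))

print([name for name, _ in attachments])
```

From here, each (filename, bytes) pair can be written to whichever directory the filename or subject line dictates.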
In my previous article, I shared AI Platform Notebook, a cloud computing service provided on Google Cloud Platform (Link to AI Platform Notebook Article). Before I found AI Platform Notebook, I used either Google Colab or GCP Compute Engine Virtual Machine instances whenever I needed a cloud service to run my Jupyter Notebook. Every service has its pros and cons. In this article, I will share my thoughts and experiences from using these cloud computing services to run my projects.
Google Colab or other virtual machines provided by cloud computing service providers?
I believe you have faced this scenario before if you have used the two types of services mentioned above.
This wonderful creation, the GCP AI Platform Notebook, is a Jupyter Notebook on Google Cloud Platform, with all the popular libraries…
Big data is trending. Smart devices, the Internet and new technologies allow the unlimited generation and transmission of data, and from that data, new information is gained. Big data comes in various forms: it can be structured, semi-structured or unstructured. Traditional data processing techniques like the Relational Database Management System (RDBMS) are no longer capable of storing or processing big data, as it has a wide variety, an extremely large volume, and is generated at high speed. Here’s where Hadoop comes into the picture. …
Natural Language Processing (NLP) is described as an application and research area that studies how computers learn and exploit natural language text or speech to create meaningful applications. In order to achieve human-like language processing for a variety of tasks or applications, NLP is necessary as a theoretically motivated set of computational techniques for the analysis and representation of naturally occurring texts at one or more levels of linguistic analysis. The term NLP is typically used to describe the function of computer system components, software or hardware, that analyse or synthesize spoken or written language.
Passionate about the Data Science path. Wish to share some of my work here =]