Showing posts with the label inconsistency

How to become a Data Analyst in 2023

Data analysis has been one of the most in-demand skills on the job market for the past few years. The "data analyst" job title is not new. However, with the growth of data generation and the easier data storage that cloud computing provides, many companies now have the capability to store their big data and to derive insights and value from it. Data analysis has been, and will remain, a fundamental skill for most jobs. In what follows, I discuss how to start a career as a data analyst and how I was able to secure a job as a data analyst at a reputable company.

Disclaimer

Prepare yourself for the worst; learn more about that here.

You should read it if

You are looking for an internship or a junior opportunity as a Data Analyst.

Data Analyst Trends

A simple search for the term "Data Analyst" on Google Trends shows a graph with a positive trend in search frequency. We can observe that from 2

Data Inconsistency in Real Life.

Motivation

Many data analysts write functions or scripts to automate the cleaning of files that they believe come in the same format. However, human mistakes are common in data entry. In my daily work I deal with "supposedly" identical datasets, yet the script runs for 2 to 3 files before throwing an error showing that a file is not in the same shape as the others.

Who should read it

Any data practitioner in general, or anyone looking to enter the field sooner or later.

Data Inconsistency In general

The biggest frustration for data analysts and scientists is data inconsistency. It adds an extra layer of suffering to our daily work. Why? Because with inconsistent data, not only do we have to transform the data into the shape we actually need, but we also have to clean up all the mistakes made by others. In short, a clean dataset that would usually take 5-10 minutes to put into good shape for modelling or visualization (pivot, stack, unstack...) now requi
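The failure described in the motivation (a cleaning script that runs for a few files and then breaks on one with a different shape) can often be caught before any cleaning starts by comparing each file's header against the expected one. The following is a minimal sketch of that idea, not the method used in the post. The column names, the file names, and the `check_header` helper are all hypothetical, and the files are in-memory strings so the example runs as-is.

```python
import csv
import io

# Hypothetical expected header; a real pipeline would define its own.
EXPECTED_COLUMNS = ["date", "region", "sales"]


def check_header(name, handle, expected=EXPECTED_COLUMNS):
    """Return a list of human-readable problems found in the file's header row."""
    reader = csv.reader(handle)
    # Normalize case and stray whitespace, two common data-entry slips.
    header = [col.strip().lower() for col in next(reader, [])]
    problems = []
    missing = [c for c in expected if c not in header]
    extra = [c for c in header if c not in expected]
    if missing:
        problems.append(f"{name}: missing columns {missing}")
    if extra:
        problems.append(f"{name}: unexpected columns {extra}")
    if not missing and not extra and header != expected:
        problems.append(f"{name}: columns are in a different order {header}")
    return problems


# In-memory stand-ins for files on disk (hypothetical names and contents).
files = {
    "jan.csv": "date,region,sales\n2023-01-01,north,10\n",
    "feb.csv": "Date,Region,Sales \n2023-02-01,north,12\n",
    "mar.csv": "date,area,sales,notes\n2023-03-01,north,9,late\n",
}

# Check every file up front instead of failing halfway through the batch.
report = []
for name, text in files.items():
    report.extend(check_header(name, io.StringIO(text)))

for line in report:
    print(line)
```

Collecting all problems into a report, rather than raising on the first mismatch, means a single pass lists every inconsistent file instead of stopping at the third one.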