Yesterday I learned that the first step of mastering statistics is to master the art of exploring data.
Exploring Data Types
Categorical data is data that can be sorted into groups. The values are labels. In the R programming language, they are called factors. To explore categorical data, you will generally use a bar chart or pie chart. The distribution of categorical data is described by counts (frequencies) or percentages.
For quantitative data, you would use a histogram, line chart, or stem plot (only if the data set is small).
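To make the two ideas above concrete, here is a minimal Python sketch. The sample values and the helper names `categorical_distribution` and `stem_plot` are made up for illustration; they do not come from the textbook. The stem plot is simplified (it skips empty stems and assumes non-negative whole numbers).

```python
from collections import Counter

# Hypothetical sample data, invented purely for illustration.
colors = ["red", "blue", "red", "green", "blue", "red"]
ages = [23, 25, 31, 38, 42]


def categorical_distribution(values):
    """Return {label: (count, percentage)} for a list of labels."""
    counts = Counter(values)
    total = len(values)
    return {label: (n, 100 * n / total) for label, n in counts.items()}


def stem_plot(values):
    """Return the rows of a simplified stem plot (tens digit | ones digits)."""
    stems = {}
    for value in sorted(values):
        stems.setdefault(value // 10, []).append(value % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in stems.items()
    ]


distribution = categorical_distribution(colors)
for label, (count, percent) in distribution.items():
    print(f"{label}: count={count}, percent={percent:.1f}%")

stem_rows = stem_plot(ages)
for row in stem_rows:
    print(row)
```

The counts and percentages are the same numbers a bar chart or pie chart would display, and the stem plot rows show why it only works for small data sets: every value gets printed.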
Exploratory Data Analysis (EDA) workflow
For the 66 Days of Data challenge, I am going back to the basics and relearning statistics. I am using the textbook Introduction to the Practice of Statistics, Sixth Edition, by Moore, McCabe, and Craig. Here is what I have learned so far.
Data are numbers with context. Before you do any statistical calculation or create data visualizations, you need to start with the habit of forming a question: “What does the data tell me?”
The starting point to any statistical analysis is to master the art of examining data.
| Person | Age | Weight |
| --- | --- |…
At the beginning of the year, I stumbled upon the hashtag #66DaysofData on Twitter. After some investigation, I found it is the same concept as #100DaysofCode, only shorter. The goal of 66 Days of Data is to make learning about data science a habit. How long you study each day is up to you, but the minimum for the challenge is 5 minutes a day.
I attempted #100DaysofCode many times (6 times, but who's counting). The longest streak I have gotten is 21 days. …
This week I had the privilege of attending visFest, which was held in Chicago. It was a surreal experience getting to meet data visualization influencers such as Mike Bostock and Shirley Wu. Getting to meet people I talk to in the D3 Slack group was also unreal.
Going to visFest gave me a new perspective on what data visualizations are and how to approach creating them.
My approach to data visualizations was: if you have data that deals with time, create a line chart. If you have data that is categorical, create a bar chart…
If you are new to coding, then congratulations on your journey! The road might be tough, but good things never come easy.
What is a Code Editor?
A code editor is like a text editor, but instead of editing text you are editing code. Most code editors have tools like linters that highlight any syntax errors in your code. Code editors can even auto-suggest lines of code to help you type quickly.
What is an IDE?
There will be a day when you are working on a real-world data set that is not formatted properly. What do you do in this situation?
a. Panic, because you never had to deal with untidy data during your school studies.
b. Rage quit because you tried to clean the data either manually or using Excel tools like Find and Replace.
c. Use regular expressions to help clean your data.
What are Regular Expressions?
Regular expressions, also known as regex, are a way to find patterns in text. Let’s say that you are reading a pdf…
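As a rough illustration of using regular expressions to clean untidy data, here is a minimal Python sketch. The phone numbers and the `clean_phone` helper are hypothetical, invented for this example; it assumes every valid record should end up with exactly 10 digits.

```python
import re

# Hypothetical untidy records, invented purely for illustration.
raw_phones = ["(312) 555-0143", "312.555.0178", " 312 555 0199 "]


def clean_phone(text):
    """Drop every non-digit character, then format as XXX-XXX-XXXX.

    Returns None when the record does not contain exactly 10 digits.
    """
    digits = re.sub(r"\D", "", text)  # \D matches any character that is not a digit
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


cleaned = [clean_phone(phone) for phone in raw_phones]
print(cleaned)
```

One pattern handles parentheses, dots, and stray spaces at once, which is the kind of work that gets painful with manual edits or Find and Replace.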
I’m pretty bad at technical questions in software interviews. When I came across one I would start to panic. Then I would lose faith in my abilities as a programmer and it would set the tone for the rest of the interview.
One day I just had enough and decided to get better at technical questions. So I did what most people do in my situation and went on HackerRank. I told myself I was going to conquer these technical questions.
I practiced for about three days and then never opened the site again.
I am in the second month of my mentorship with the Chicago Python (ChiPy) Group. My capstone project will be an exploratory data analysis visualization dashboard of superhero characters, built from a data set found on Kaggle. It will use D3.js for the visualizations and the Flask framework for the backend.
So far this month I have been learning the basics of Python and Flask. Flask is a microframework which has been used in applications such as LinkedIn and Pinterest. The hardest part about learning Flask is setting it up on my Windows machine. It took about a week for me to…
This week began with a retrospective with my mentor. A retrospective is when you review what went well with your project and what went wrong during the week so you can make minor improvements.
I needed help with managing my time. My mentor suggested that I read David Allen’s Getting Things Done, also known as GTD. GTD is a productivity system for people who feel overwhelmed with work tasks and need a way to organize things. I attempted to read it but it was too dry; I kept falling asleep.
During the first week of my ChiPy Mentorship program, my mentor suggested that I brush up on some Python foundations. I learned some Python during my time at college, but just enough to get through homework assignments and exams. I never had a deep understanding of the Python language because I didn’t have a deep understanding of Python’s data structures.
In order to program in any language, you need to know two things: data types and data structures. A data type is how information is represented in a programming language. The most common data types between…
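To show the difference between the two, here is a small Python sketch. The variable names and values are made up for illustration, and the selection of types and structures is my own assumption about which ones are worth showing first, not a list taken from the original post.

```python
# Data types: how a single piece of information is represented.
age = 29            # int: whole number
weight = 150.5      # float: number with a decimal part
name = "Ada"        # str: text
is_student = False  # bool: True or False

# Data structures: ways of organizing several values together.
ages = [29, 34, 41]                  # list: ordered and changeable
point = (3, 4)                       # tuple: ordered and fixed
person = {"name": name, "age": age}  # dict: values looked up by key
unique_ages = {29, 34, 29}           # set: duplicates are dropped

type_names = [type(v).__name__ for v in (age, weight, name, is_student)]
structure_names = [type(v).__name__ for v in (ages, point, person, unique_ages)]
print(type_names)
print(structure_names)
```

Calling `type()` on a value is a quick way to check what you are actually working with when code behaves unexpectedly.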