Learn How to Analyze Big Data for Free

“What Should I Study?”

Right before I started my first full-time job, I had a good talk with the chairman of my university’s supply chain department. I asked him, “If I were going to study more after I graduate, what do you suggest I focus on?” Without a second of hesitation, he responded, “databases and statistics.”

Really? Databases and statistics?

I had taken the required introductory statistics class. Then, I promptly forgot everything that Excel couldn’t automatically do for me.

I had also done a few queries in Microsoft Access, but anything more on databases was taught over at the computer engineering school – not the business school.

As I entered the workforce, I read up on many other topics related to my field: books on negotiations, communication, and network optimization. I received my APICS certification. Logistics and purchasing trade magazines covered my desk.

I eventually taught myself SQL through Sams Teach Yourself SQL in 10 Minutes so that I could better query our company’s databases, but that seemed like more than enough database knowledge for what I needed to do.

However, after a few years, I kept running into the term “big data.” Although there are many definitions for big data, I like to think of it as ‘more data than an Excel spreadsheet can handle’. What’s cool about big data, and the reason it’s gathered so much attention, is that you can find trends and patterns that were impossible to uncover before companies started collecting so much information. Faster computers also allow you to analyze that information without waiting weeks for your calculations to process.

All of this is extremely important to supply chains and company operations. Millions of dollars are just waiting to be saved if you can uncover better trends and patterns, especially in forecasts.

The surprise for me, however, was that to be a big data ninja (or analyst, if you want to use the boring job title), you need fairly decent skills in two areas. Would you like to guess what those two skills are?

Databases and statistics.

Learn Databases and Statistics through Free Online Courses

Well, here’s the best part of this article: you can start mastering both of those topics for free.

Stanford University is offering free online classes on both of those topics right now. The process is easy and straightforward:

  • Professors lecture, explaining concepts and examples through online videos
  • You read the free course materials and/or books
  • You work through examples with free open-source software
  • You take online quizzes and talk with other students
  • You learn some awesome big data skills and even get a certificate of completion from Stanford when you finish

To enroll, you just register at the class websites: “Introduction to Databases” and “Statistical Learning.”

I’m taking the statistical learning class right now and I’m really enjoying it. A friend of mine finishing his MBA program let me know about the statistical learning class. His professor suggested that he may want to take it as he heads off to work for UPS.

The classes started about a month ago, but there’s no problem starting late and jumping in now. Plus – there’s no risk at all – so if you start and realize it’s not the thing for you, then oh well, no worries.

Why I’m Studying Statistical Learning

Databases make sense if you’re interested in getting into big data. However, statistics seems a little more intimidating to me. Here’s the reason I’m taking the statistical learning class.

Before the class, I knew how to use the trendline function in Excel to find the relationship between two variables. I could easily figure out if there’s a correlation between sales and the money spent on TV advertising.

However, now I’m learning how to find the relationship between sales and several variables at once, such as combined advertising in TV, radio, and newspaper. I’m in the middle of chapter 3, where I learned what method to use to figure out whether a variable is actually related to the outcome, or if other variables are responsible for the change. For example, shark attacks and ice cream sales both go up in the summer, but ice cream isn’t causing shark attacks. Similarly, in the advertising data I’m working with, newspaper advertising appears at first glance to have an effect on sales. However, when we look at all the variables together, newspaper advertising doesn’t affect sales at all; only TV and radio advertising do.
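
To make the newspaper example concrete, here is a minimal sketch of the idea. It is written in Python rather than R (the language the class uses), and every number in it is made up for illustration; it is not the class’s advertising data set. In this toy data, sales depend only on TV and radio, and newspaper spending simply tends to rise whenever radio spending does. The helper names `solve` and `fit` are my own, not from any library.

```python
from itertools import product


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(predictors, y):
    """Least-squares fit; returns [intercept, slope_1, slope_2, ...]."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: every combination of three TV budgets, three radio budgets,
# and three "extra" newspaper amounts (27 rows in total).
tv, radio, newspaper, sales = [], [], [], []
for t, r, extra in product([50, 150, 250], [10, 30, 50], [-5, 0, 5]):
    tv.append(t)
    radio.append(r)
    newspaper.append(10 + 0.5 * r + extra)  # newspaper tracks radio
    sales.append(3 + 0.05 * t + 0.2 * r)    # newspaper plays no part in sales

simple = fit([newspaper], sales)            # like an Excel trendline
multi = fit([tv, radio, newspaper], sales)  # all three variables together

print("newspaper alone:    slope = %.3f" % simple[1])
print("all three together: tv = %.3f, radio = %.3f, newspaper = %.3f"
      % (multi[1], multi[2], multi[3]))
```

On its own, newspaper gets a slope of about 0.32, which looks like a real effect. Fitted together with TV and radio, its coefficient drops to essentially zero, because radio was doing the work all along. The toy sales figures have no noise in them, so the result is cleaner than anything you would see with real data.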

Now that’s an awesome observation if you work in marketing, but how will that help our supply chain? I plan to approach our forecasts much differently after understanding these statistical analysis techniques. If I could find relationships on dimensions such as date, location, promotion, price, and other variables, then I could get much better forecasts and hold less inventory. Even if I could improve our forecasts just a little, that’d more than make my time in a free class worth it.

The class teaches you how to use an open-source program called R, which many companies, such as Google, P&G, and Ford, are beginning to use. If they’re using it, and it’s a free program, then maybe my growing company should use it too.

If anyone is brave enough to sign up with me, let me know, and we can work together on any tough problems we encounter.

That’s my thought for the week. The internet and improvements in technology have given us the challenge and opportunity of big data. Similarly, the internet and Stanford have given us a free way to learn how to surmount that challenge. Pretty cool, and definitely Supply Chain Cowboy approved.

If big data is something that interests you, here’s a recent, related article: Getting Started with Big Data in a Small Business