The Life of a Data Scientist

http://www.mastersindatascience.org/careers/data-scientist/

Data scientists are big data wranglers. They take an enormous mass of messy data points (unstructured and structured) and use their formidable skills in math, statistics and programming to clean, massage and organize them. Then they apply all their analytic powers – industry knowledge, contextual understanding, skepticism of existing assumptions – to uncover hidden solutions to business challenges.

Data Scientist Responsibilities

“A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”

On any given day, a data scientist may be required to:

  • Conduct undirected research and frame open-ended industry questions
  • Extract huge volumes of data from multiple internal and external sources
  • Employ sophisticated analytics programs, machine learning and statistical methods to prepare data for use in predictive and prescriptive modeling
  • Thoroughly clean and prune data to discard irrelevant information
  • Explore and examine data from a variety of angles to determine hidden weaknesses, trends and/or opportunities
  • Devise data-driven solutions to the most pressing challenges
  • Invent new algorithms to solve problems and build new tools to automate work
  • Communicate predictions and findings to management and IT departments through effective data visualizations and reports
  • Recommend cost-effective changes to existing procedures and strategies

Every company will have a different take on job tasks. Some treat their data scientists as glorified data analysts or combine their duties with data engineers; others need top-level analytics experts skilled in intense machine learning and data visualizations.

As data scientists achieve new levels of experience or change jobs, their responsibilities invariably change. For example, a person working alone in a mid-size company may spend a good portion of the day in data cleaning and munging. A high-level employee in a business that offers data-based services may be asked to structure big data projects or create new products.

An Interview with a Real Data Scientist

We caught up with Lisa Qian, Data Scientist at Airbnb, to find out what it’s like to work as a data scientist. Read on to learn about the impact data science has on Airbnb’s success, the programming languages they use on the job, and what students need to know in order to succeed.

Q: What are the top pros & cons of your job?
A: Things happen very quickly and data scientists have a big impact (see answer to next question). At Airbnb, there are so many interesting problems to work on and so much interesting data to play with. The culture of the company also encourages us to work on lots of different things. I have been at Airbnb for less than two years and I have already worked on three completely different product teams. There’s really never a dull moment. This can also be a “con” of the job. Because there are so many interesting things to work on, I often wish that I had more time to go more in depth on a project. I’m often juggling multiple projects at once, and when I’m 90% done with one of them, I’ll just move on to something else. Coming from academia where one spends years and years on one project without leaving a single rock unturned (I did a PhD in physics), this has been a delightful, but sometimes frustrating, cultural transition.
Q: How much of an impact do data scientists have on Airbnb’s overall success?
A: A ton! As a data scientist, I’m involved in every step of a product’s life cycle. For example, right now I am part of the Search team. I am heavily involved in research and strategizing where I use data to identify areas that we should invest in and come up with concrete product ideas to solve these problems. From there, if the solution is to come up with a data product, I might work with engineers to develop the product. I then design experiments to quantify the effect and impact of the product, and then run and analyze the experiment. Finally, I will take what I learned and provide insights and suggestions for the next product iteration. Every product team at Airbnb has engineers, designers, product managers, and one or more data scientists. You can imagine the impact data scientists have on the company!
Q: Which skills or programming languages do you most frequently use in your work, and why?
A: At Airbnb, we all use Hive (which is similar to SQL) to query data and build derived tables. I use R to do analysis and build models. I use Hive and R every day of the job. A lot of data scientists use Python instead of R – it’s just a matter of what we were familiar with when we came in. There have also been recent efforts to use Spark to build large-scale machine learning models. I haven’t gotten a chance to try it out yet, but plan on doing so in the near future. It seems very powerful.
Q: What kind of person makes the best data scientist?
A: Successful data scientists have a strong technical background, but the best data scientists also have great intuition about data. Rather than throwing every feature possible into a black box machine learning model and seeing what comes out, one should first think about if the data makes sense. Are the features meaningful, and do they reflect what you think they should mean? Given the way your data is distributed, which model should you be using? What does it mean if a value is missing, and what should you do with it? The answers to these questions differ depending on the problem you are solving, the way the data was logged, etc., and the best data scientists look for and adapt to these different scenarios. The best data scientists are also great at communicating, both to other data scientists and non-technical people. In order to be effective at Airbnb, our analyses have to be both technically rigorous and presented in a clear and actionable way to other members of the company.
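
To make the missing-value question above concrete, here is a minimal Python sketch. It is not Airbnb's method: the records, the field names and the two helper functions are hypothetical, made up purely for illustration. It measures how much of a feature is missing before deciding what to do, then shows one possible treatment (median imputation).

```python
from statistics import median

# Hypothetical listing records; None marks a value that was never logged.
rows = [
    {"price": 120.0, "reviews": 14},
    {"price": None, "reviews": 3},
    {"price": 80.0, "reviews": None},
    {"price": 100.0, "reviews": 0},
]


def missing_share(rows, field):
    """Fraction of rows in which `field` is missing (None)."""
    return sum(r[field] is None for r in rows) / len(rows)


def impute_median(rows, field):
    """Return copies of the rows with missing `field` values replaced
    by the median of the observed values. The input is not modified."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


print(missing_share(rows, "price"))
filled = impute_median(rows, "price")
print([r["price"] for r in filled])
```

Whether imputation is appropriate depends on why the value is missing. A missing review count might mean "no reviews yet" or "not logged", and those two cases call for different handling.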

Q: What advice would you offer students preparing for a position as a data scientist?
A: Beyond taking programming and statistics courses, I would recommend doing everything possible to get your hands dirty and work with real data. If you don’t have the time to do an internship, sign up to participate in hackathons or offer to help out a local startup by tackling a data problem they have. Courses and books are great for developing fundamental technical skills, but many data science skills can’t be properly developed in a classroom where data sets are well groomed.

Data Scientist Salaries

“Data scientist” is the hottest job title in the IT field – with starting salaries to match. It should come as no surprise that Silicon Valley is the new Jerusalem. According to a 2014 Burtch Works study, 36% of data scientists work on the West Coast. Entry-level professionals in that area earn a median base salary of $100,000 – 22% more than their Northeast peers.

Data Scientist

Glassdoor
Average Salary (2015): $118,709 per year
Minimum: $76,000
Maximum: $148,000

PayScale
Median Salary (2015): $93,991 per year
Total Pay Range: $63,524 – $138,123

Senior Data Scientist

PayScale
Median Salary (2015): $124,273 per year
Total Pay Range: $89,801 – $179,445

Data Scientist Qualifications

What Kind of Degree Will I Need?

Broadly speaking, you have 3 education options if you’re considering a career as a data scientist:

  1. Degrees and graduate certificates provide structure, internships, networking and recognized academic qualifications for your résumé. They will also cost you significant time and money.
  2. MOOCs and self-guided learning courses are free/cheap, short and targeted. They allow you to complete projects on your own time – but they require you to structure your own academic path.
  3. Bootcamps are intense and faster to complete than traditional degrees. They may be taught by practicing data scientists, but they won’t give you degree initials after your name.

Academic qualifications may be more important than you imagine. As Burtch Works notes, “it’s incredibly rare for someone without an advanced quantitative degree to have the technical skills necessary to be a data scientist.”

In its data science salary report, Burtch Works determined that 88% of data scientists have a master’s degree and 46% have a PhD. The majority of these degrees are in rigorous quantitative, technical or scientific subjects, including math and statistics (32%), computer science (19%) and engineering (16%).

With that being said, companies are desperate for candidates with real-world skills. Your technical know-how may trump preferred degree requirements.

Note: Check out our list of 23 Great Schools with Master’s Programs in Data Science.

What Kind of Skills Will I Need?

Technical Skills

  • Math (e.g. linear algebra, calculus and probability)
  • Statistics (e.g. hypothesis testing and summary statistics)
  • Machine learning tools and techniques (e.g. k-nearest neighbors, random forests, ensemble methods, etc.)
  • Software engineering skills (e.g. distributed computing, algorithms and data structures)
  • Data mining
  • Data cleaning and munging
  • Data visualization (e.g. ggplot and d3.js) and reporting techniques
  • Unstructured data techniques
  • R and/or SAS languages
  • SQL databases and database querying languages
  • Python (most common), C/C++, Java, Perl
  • Big data platforms like Hadoop, Hive & Pig
  • Cloud tools like Amazon S3

This list is always subject to change. As Anmol Rajpurohit suggests, “generic programming skills are a lot more important than being the expert of any particular programming language.”
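
As a small illustration of one machine learning technique named in the list above, k-nearest neighbors can be sketched in a few lines of plain Python. This is a teaching sketch under simple assumptions (Euclidean distance, majority vote, a tiny made-up training set), not a production implementation. The function name `knn_predict` and the sample points are hypothetical.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs, where each point is a tuple
    of numbers with the same length as `query`.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up two-dimensional training data with two well-separated classes.
train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
    ((4.8, 5.1), "b"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 4.8)))
```

In practice the choice of k, the distance measure and the scaling of features all affect the result, which is where the math and statistics skills in the same list come in.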

Business Skills

  • Analytic Problem-Solving: Approaching high-level challenges with a clear eye on what is important; employing the right approach/methods to make the maximum use of time and human resources.
  • Effective Communication: Detailing your techniques and discoveries to technical and non-technical audiences in a language they can understand.
  • Intellectual Curiosity: Exploring new territories and finding creative and unusual ways to solve problems.
  • Industry Knowledge: Understanding the way your chosen industry functions and how data are collected, analyzed and utilized.

Note: You can view a handy trajectory on How to Become a Data Scientist in an infographic from Datacamp.

What About Certifications?

To avoid wasting time on poor-quality certifications, ask your mentors for advice, check job listing requirements and consult articles like the Tom’s IT Pro “Best Of” certification lists. Here are a few that focus on useful skills:

Certified Analytics Professional (CAP)

CAP was created in 2013 by the Institute for Operations Research and the Management Sciences (INFORMS) and is targeted towards data scientists. During the certification exam, candidates must demonstrate their expertise in the end-to-end analytics process. This includes the framing of business and analytics problems, data and methodology, model building, deployment and life cycle management.

Requirements:

  • 5+ years of analytics work-related experience for BA/BS holder in a related area
  • 3+ years of analytics work-related experience for MA/MS (or higher) holder in a related area
  • 7+ years of analytics work-related experience for BA/BS (or higher) holder in an unrelated area
  • Verification of soft skills/provision of business value by employer
  • Agreement to adhere to Code of Ethics

Cloudera Certified Professional: Data Scientist (CCP:DS)

Targeted towards the elite level, the CCP:DS is aimed at data scientists who can demonstrate advanced skills in working with big data. Candidates are drilled in 3 exams – Descriptive and Inferential Statistics, Unsupervised Machine Learning and Supervised Machine Learning – and must prove their chops by designing and developing a production-ready data science solution under real-world conditions.

Related Cloudera certifications include:

EMC: Data Science Associate (EMCDSA)

The EMCDSA certification tests your ability to apply common techniques and tools required for big data analytics. Candidates are judged on their technical expertise (e.g. employing open source tools such as “R”, Hadoop, and Postgres, etc.) and their business acumen (e.g. telling a compelling story with the data to drive business action).

Once you’ve passed the EMCDSA, you can consider the Advanced Analytics Specialty. This works on developing new skills in areas such as Hadoop (and Pig, Hive, HBase), Social Network Analysis, Natural Language Processing, data visualization methods and more.

SAS Certified Predictive Modeler using SAS Enterprise Miner 7

This certification is designed for SAS Enterprise Miner users who perform predictive analytics. Candidates must have a deep, practical understanding of the functionalities for predictive modeling available in SAS Enterprise Miner 7 before they can take the performance-based exam. This exam includes topics such as data preparation, predictive models, model assessment and scoring and implementation.

Related SAS certifications include:

Jobs Similar to Data Scientist

Some data scientists get their start working as low-level Data Analysts, extracting structured data from MySQL databases or CRM systems, developing basic visualizations or analyzing A/B test results. These jobs aren’t usually that challenging.

However, once you have your technical skills in order, you have plenty of options. If you’d like to push beyond your analytical role, you could think about building/engineering/architecture jobs such as:

Data Scientist Job Outlook

In an oft-cited 2011 big data study, McKinsey reported that by 2018 the U.S. could face a shortage of 140,000 to 190,000 “people with deep analytic skills” and 1.5 million “managers and analysts with the know-how to use the analysis of big data to make effective decisions.”

The ensuing panic has led to high demand for data scientists. Companies of every size and industry – from Google, LinkedIn and Amazon to the humble retail store – are looking for experts to help them wrestle big data into submission. Starting salaries are astronomical.

The bubble is bound to burst, of course. In a 2014 Mashable article, Roy Lowrance, the managing director of New York University’s Center for Data Science program, is quoted as saying “anything that gets hot like this can only cool off.” But even as demand for data engineers surges, job postings for big data experts are expected to remain high.

There are also some indications that the roles of data scientists and business analysts are beginning to merge. In certain companies, “new look” data scientists may find themselves responsible for financial planning, ROI assessment, budgets and a host of other duties related to the management of an organization.

Professional Organizations for Data Scientists
