I'm Mark Graus, and I'm Human-Centric. For over 10 years I have been working with data from websites, apps, online services and smart devices to improve them by understanding what users want and need. I am available as a data scientist or data science consultant to help you leverage your data.
I have two main beliefs:
1. Data Science should be accessible to anyone: I try to demystify data science and help organizations leverage their data.
2. Data Science should improve quality of life: I take a human-centric approach in everything I do and try to raise awareness of human-centric data science.
I work mostly with data that is the result of people interacting with websites, apps or services. I am convinced the only way to do that properly is by adopting a human-centric perspective and thoroughly understanding the people whose data is used, along with their needs, goals and interests. This is necessary to properly understand the data, but also to know how data can, should and should not be used.
As a data scientist I have always incorporated fundamental psychological theory in data solutions, and I have validated the resulting technology through user studies to understand its effects on subjective user experience.
Collecting data is one part; making sense of it is another challenge. Through Exploratory Data Analysis (EDA) we extract useful, relevant insights from our data. What navigation behavior is predictive of purchase behavior? Which customers show behavior that indicates they may quit your service?
Conducting an EDA starts with a dataset and ideally a question or a research direction. The output is an analysis conducted in R or Python, providing the answer to a question and whatever relevant findings were encountered.
Predictive modeling is used to create a model that makes predictions on what future behavior an individual is most likely to exhibit, based on what they have done in the past. These predictions can for example be used for targeting, recommending, or providing a next best action.
A predictive model requires available data. Using tooling such as Python, a pipeline is constructed and then deployed to make its predictions available in real time.
A/B Testing (or online controlled experimentation) is the practice of comparing different versions of a system to see which works better. This could be two different designs of the same website, or a version with some new functionality against a version without.
Conducting an A/B test consists of planning which versions to compare, deciding how long to run the test, implementing data collection, and analyzing and interpreting the results. Various tools such as Google Optimize and Omniture Test&Target allow you to do this.
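To give an idea of what the analysis step can look like, here is a minimal sketch in Python of a two-proportion z-test on conversion rates. The function name and the visitor and purchase counts are made up for illustration, and a real test would also involve choosing the sample size and duration up front.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between A and B."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Made-up counts: purchases and visitors for each version.
rate_a, rate_b, z, p_value = two_proportion_z_test(
    conv_a=200, n_a=5000, conv_b=250, n_b=5000
)
print(f"A: {rate_a:.2%}  B: {rate_b:.2%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the difference between the versions is unlikely to be chance alone. Whether the difference is large enough to matter remains a judgment about your users and your business.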
Exploratory data analysis is the activity of using data to gain an understanding of what is happening. Exploratory data analysis can provide answers to questions such as "how much do users use my service on average?" or "what segments of users can we distinguish in terms of behavior?"
A good exploratory data analysis describes the data, the analysis and the findings. Especially when using tools such as R and Python, the analysis can be performed and documented in the same tool. This allows the analyst to deliver the results, and the way those results were produced, in a notebook that contains not only the findings but also the code to alter or expand the analysis.
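As a small illustration of the two example questions above (average usage and behavioral segments), the sketch below uses only the Python standard library. The session log, the field names and the segment thresholds are all hypothetical; in a real analysis the segments would come from the data and the domain rather than fixed cut-offs.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical event log: one (user_id, session_minutes) row per session.
sessions = [
    ("u1", 12), ("u1", 30), ("u1", 8),
    ("u2", 5),
    ("u3", 45), ("u3", 50), ("u3", 40), ("u3", 60),
    ("u4", 3), ("u4", 7),
]

# Aggregate to one record per user: number of sessions and total minutes.
per_user = {}
for user_id, minutes in sessions:
    stats = per_user.setdefault(user_id, {"sessions": 0, "minutes": 0})
    stats["sessions"] += 1
    stats["minutes"] += minutes

# "How much do users use my service on average?"
total_minutes = [stats["minutes"] for stats in per_user.values()]
print("mean minutes per user:", mean(total_minutes))
print("median minutes per user:", median(total_minutes))


def segment(stats):
    """Assign a usage segment; the thresholds are illustrative assumptions."""
    if stats["minutes"] >= 60:
        return "heavy"
    if stats["minutes"] >= 15:
        return "regular"
    return "light"


# "What segments of users can we distinguish in terms of behavior?"
segments = Counter(segment(stats) for stats in per_user.values())
print(dict(segments))
```

Reporting the median next to the mean is a deliberate choice: a few very active users can pull the mean far away from what a typical user does.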
Predictive modeling relies on the fact that patterns in the data can be used to make predictions. It might for example be used to predict whether a visitor on a website is likely to make a purchase, based on the behavior they exhibit. Or it might be used to predict whether a user is likely to enjoy a certain song or movie in a multimedia streaming service. These predictions can then be used to help the user.
Predictive modeling requires data. It requires historical data to build the predictive model, but it also requires real-time, or recent, data to make predictions about. The model itself is deployed in your infrastructure and provides predictions in real time based on the data it is being fed.
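The split between historical data for building the model and recent data for making predictions can be sketched as follows. This is a deliberately tiny logistic regression written with the standard library only, so that it runs anywhere; in practice a library such as scikit-learn would usually take this role. The features (pages viewed, added to cart), the data and the function names are assumptions made for the example.

```python
import math


def predict_proba(weights, features):
    """Estimated purchase probability; weights[0] is the bias term."""
    score = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-score))


def train_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit the weights with plain batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict_proba(weights, features) - label
            gradient[0] += error
            for i, value in enumerate(features):
                gradient[i + 1] += error * value
        weights = [
            w - learning_rate * g / len(rows) for w, g in zip(weights, gradient)
        ]
    return weights


# Historical data (made up): (pages_viewed, added_to_cart) -> purchased or not.
history = [(1, 0), (2, 0), (3, 0), (2, 0), (6, 1),
           (8, 1), (5, 1), (7, 0), (1, 0), (9, 1)]
purchased = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
weights = train_logistic_regression(history, purchased)

# Recent data: visitors currently on the site, scored with the trained model.
engaged_visitor = predict_proba(weights, (8, 1))
casual_visitor = predict_proba(weights, (1, 0))
print(f"engaged: {engaged_visitor:.2f}  casual: {casual_visitor:.2f}")
```

In a deployed setting the training step would run periodically on historical data, while the scoring step would run on each new visitor as their behavior comes in.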
The proof of the pudding is in the eating. Once you collect data and use it to come up with improvements to your systems or products, it is a straightforward step to use the same data to evaluate those improvements. Online controlled experimentation, or A/B testing, requires knowledge of statistics, technology, experimental design and the testing process itself.
If you have data that relates to how people interact with your products, your brand, your websites or any other system, I can help you leverage this data. For over a decade I have been working with this type of data to gain an understanding of users, and at times to alter the technology in real time to cater to their inferred needs.