FAQ and False Myths
We’ve gathered a list of the most frequently asked questions and false myths we intend to dispel.
If you have further questions or would like to know how your problem can be solved, you are welcome to contact us by e-mail at info@zurich-data-scientists.ch, without any obligation on your part.
Data (Security)
Data protection and confidentiality are core elements at ZDS. Everyone at ZDS is regularly trained in handling highly sensitive data to ensure the safety, integrity and privacy of your data. We have long-standing experience with medical, personal and industrial data, among others.
All the data entrusted to us are stored and processed exclusively in Switzerland. The data are kept on encrypted storage with physical security controls. For receiving or sending data, we offer encrypted communication. To restrict internal data access to only those people working on your project, we have implemented Discretionary Access Control (DAC).
Yes, we can work on-site at your premises if this suits your requirements better.
We safely remove your data from all our systems at your request at any time. If a follow-up project is likely, we can also keep your data safely stored in the meantime.
You remain the owner of all data entrusted to us. Your data will not be used for any other purpose or sent to a third party without your explicit permission.
This is entirely up to you. We accept any file format your data may be in.
No, generally we can adapt our statistical modeling to the data at hand. Whether the number of data points is indeed too low for any relevant output can be discussed in your specific case.
We work on an established IT system with technical and organizational measures in place to ensure the long-term confidentiality, integrity, and safety of all data we handle. We are data-driven in every aspect of our company. We created our HR-related tools and software ourselves and also offer consultation on such systems and on process automation.
ZDS (Way of Working)
Just drop us an email or book a free, non-binding first online chat with us, and we can discuss your specific situation. Through our long-term experience in the field, we have seen many different projects in a variety of industries, private companies and research areas. This helps us to quickly understand and assess your request.
Yes, definitely.
No, we build on the useful parts that already exist and try to improve the insights, and we also provide guidance for future projects. Making big changes to existing projects is often not an efficient way to work. Many of the projects we work on are good, but a fresh and objective view can often lead to large improvements in relatively simple ways. In this way, the quality of the output improves considerably.
Just reach out to us and we can elaborate on your specific setting and make you a non-binding offer.
ZDS has an office in Zurich, Switzerland. We offer consultations online or on-site at your place anywhere in Switzerland.
Yes, people (team members and clients) are most important to us. For many tasks, we apply the “four-eyes principle”, that is, at least two people always check the work. This greatly improves the quality of the insights gained.
Yes, teaching is the best way to train your communication skills, which are extremely important in data science. However, we will introduce you to it, and you don’t need to be an academic lecturer already (although some experience in that area is obviously a plus).
False Myths
This is not true, but we often find that data science is perceived this way. In a project with sub-optimal data collection, data science cannot extract all the desired insights from the data.
ZDS supports you in answering the question of what is and what is not possible in such situations, taking into account state-of-the-art analysis techniques. We also provide transparent advice on how to adapt your experimental design or change the way you collect data to gain the most from future projects.
This is not true. If the aim is forecasting, the choice of the predictive model is crucial to ensure that the actions that are taken depending on its outputs are favorable. If the aim is estimation of effects, one should choose a suitable statistical model carefully, otherwise nonsensical results will be derived.
Time and again, we hear about other data scientists’ work. Upon closer inspection, we very often find mistakes in their work of which they are not aware. One example is calculating “a number” that does not adequately answer the research question or that is the product of misconceptions and biases. A data science expert needs to be able to understand a problem in full detail, to formalise it mathematically, and to find a suitable statistical or machine learning model from the huge universe of options. At ZDS, we are all trained statisticians with years of education and experience to fulfil exactly these requirements.
Be careful! Of course, some situations and analysis techniques are easier to interpret than others. However, we have seen many examples where the analysis was done correctly, but the interpretation did not fit the data and the statistical analysis. It is of course a great pity when a lot of good work has been done but the “final” step of interpretation goes wrong.
At ZDS, we focus on a correct, very clear and understandable explanation of the results in the respective real world context, and also explain where the limitations lie.
This is only partly true. At ZDS, we have seen many cases where the data collection was suboptimal or the wrong concepts were used, jeopardising entire expensive projects for some clients. The best approach is to involve a statistician from the start of your project to support you in planning your data collection. The gold-standard procedure up to the point of data collection would be:
- Precisely define your research questions.
- Translate them into mathematical language.
- Define the statistical analysis plan and the data you would need.
- Start collecting the data.
Following this procedure will ensure that there is no mismatch between the data and the questions you are trying to answer, and will give you the best basis for a successful project.
This is not recommended. Unfortunately, it happens all too often that the statistician has to try to “save” a project as best they can, rather than simply answering the project’s objectives.
“To consult the statistician after an experiment is finished is often merely to ask them to conduct a post mortem examination. They can perhaps say what the experiment died of.” (famous quote from R. A. Fisher)
Data science also covers the planning stages: how and what data need to be collected, which research questions can be answered, and how to be most efficient. If these points are not taken into account, the results can be very sobering, or the whole project may even fail.
At ZDS, we tell you what is feasible and how to achieve it. This saves enormous costs.
This is not necessarily true. A so-called “in-sample” performance, where you train and evaluate the model on the same data, can differ substantially from the performance you would obtain if you used the model on new data. A model or machine learning method can “overfit”, which means that it works properly only on the data it was trained on.
Often, a model is fitted with the aim of using it later with new data, let’s say to make predictions for new data from time to time. Thus, a model’s performance has to reflect how well the model would perform on new, unseen data. This “generalisation performance” cannot be known (we do not know the new data yet), but it can be and needs to be estimated, in order to get a realistic view of the model.
At ZDS, we are experts in estimating the generalisability of models, setting up a framework for estimation that is tailored to the exact setting in which the model will be used. This ensures that the model is really useful for your daily business and brings the desired added value.
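The gap between in-sample and generalisation performance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated, the helper names (`make_data`, `predict_1nn`, `mse`) are hypothetical, and a 1-nearest-neighbour predictor is used simply because it memorises its training data. It is not a description of any particular ZDS workflow.

```python
import random


def make_data(n, rng):
    """Simulate y = 2x + noise. Purely synthetic, for illustration only."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.5)) for x in xs]


def predict_1nn(train, x):
    """Predict with the response of the closest training point."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def mse(train, data):
    """Mean squared error of the 1-nearest-neighbour predictor on `data`."""
    return sum((y - predict_1nn(train, x)) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(50, rng)
test = make_data(50, rng)  # new, unseen data from the same process

in_sample_mse = mse(train, train)  # evaluated on the data used for fitting
held_out_mse = mse(train, test)    # estimate of generalisation performance

print(f"in-sample MSE: {in_sample_mse:.3f}")
print(f"held-out MSE:  {held_out_mse:.3f}")
```

The in-sample error looks perfect because each training point is its own nearest neighbour, while the held-out error shows what could be expected on new data. In practice, a single split is often replaced by cross-validation or by a scheme that mirrors how the model will actually be used.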
This is not true. Machine learning has been developed primarily to make precise predictions. However, this is a very different task from statistical inference, where the goal is to estimate certain effect sizes and quantify the (un)certainty in these estimates. Such goals require interpretable methods, and many modern “classical” statistical methods (which, by the way, can also be very flexible!) are designed to do just that.
Recent methods and frameworks have brought the predictive power of machine learning to statistical inference problems, and at ZDS we know these methods well due to our close ties to current research groups.
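To make the difference between prediction and inference concrete, the following sketch estimates an effect size together with its uncertainty. It assumes simulated data, a simple linear model, and a large-sample normal approximation for the confidence interval (an exact interval would use the t distribution). The variable names and the value of `true_slope` are hypothetical choices for the example.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(1)
n = 200
true_slope = 1.5  # assumed effect size, used only to simulate data

x = [rng.uniform(0.0, 10.0) for _ in range(n)]
y = [3.0 + true_slope * xi + rng.gauss(0.0, 2.0) for xi in x]

# Ordinary least squares estimates for a simple linear regression.
x_bar, y_bar = fmean(x), fmean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Standard error of the slope from the residual variance.
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(rss / (n - 2) / sxx)

# Approximate 95% confidence interval (normal approximation).
z = NormalDist().inv_cdf(0.975)
ci_lower, ci_upper = slope - z * se_slope, slope + z * se_slope

print(f"estimated effect: {slope:.3f}")
print(f"approx. 95% CI:   [{ci_lower:.3f}, {ci_upper:.3f}]")
```

The output of interest here is not a prediction for a new observation but the estimated effect and the interval around it, which is what statistical inference is meant to deliver.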
Unfortunately not. Some people think that with enough data and the right algorithms, machine learning can predict any future event with high accuracy.
However, in reality, predictive models are limited by the quality of the data, the nature of the problem, and inherent uncertainty. Some events are inherently unpredictable. Data science is a powerful tool, but it’s not a “cure-all” and must be applied correctly.
ZDS helps you make the best predictions possible and discusses the limitations of your project with you.
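The role of inherent uncertainty can be shown with a small simulation. It assumes, purely for illustration, that the true signal (`true_signal`) and the noise level (`noise_sd`) are known, which is never the case in a real project. Even a predictor that knows the true signal exactly cannot push its error below the noise variance.

```python
import math
import random


def true_signal(x):
    """Hypothetical true relationship, assumed known only in this simulation."""
    return math.sin(x)


rng = random.Random(42)
noise_sd = 1.0  # assumed standard deviation of the unpredictable part
n = 10_000

xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
ys = [true_signal(x) + rng.gauss(0.0, noise_sd) for x in xs]

# Error of the ideal predictor, which uses the true signal itself.
oracle_mse = sum((y - true_signal(x)) ** 2 for x, y in zip(xs, ys)) / n

print(f"MSE of the ideal predictor: {oracle_mse:.3f}")
print(f"noise variance:             {noise_sd ** 2:.3f}")
```

The two printed numbers are close to each other: the remaining error comes from the noise itself, so more data or a more sophisticated algorithm would not remove it.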