Who really is a data scientist, what do they do, and how do you become one?
Artificial intelligence (AI) is a hot topic that stirs the imagination. AI is fascinating, but it can also instill fear of the unknown: robots that might one day take over the world. However, there are always people behind any technology. It is their knowledge and experience in data science and machine learning that make it easier for us to use popular websites such as YouTube, Pracuj.pl, AirBnB, Twitter, Allegro, Instagram or Facebook, among others. What does the data scientist have to do with them?
Data science is a field that focuses on recognizing repeating patterns in data sets, and also on transforming the data so that those patterns become easier to notice. In other words, data science helps to discover insights in data, sometimes deeply hidden ones. Thanks to this field, we can create algorithms that implement many practical functions, such as converting speech into written text. This is used, among other places, in the keyboards of Android smartphones, where pressing the microphone icon lets you dictate text instead of typing it.
Another example is product suggestions in an online store. Analyzing thousands of transactions made by other users (in the case of Amazon, millions) makes it possible to suggest products related to our choice that we may also need.
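
The idea of suggesting products from other users' transactions can be illustrated with a minimal sketch. It simply counts how often two products appear in the same basket. This is one simple approach chosen for illustration, not a description of how Amazon or any other store actually builds its recommendations. The function names `build_cooccurrence` and `suggest` and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(transactions):
    """Count how often each ordered pair of products shares a basket."""
    pair_counts = Counter()
    for basket in transactions:
        # A sorted set ensures a product repeated in one basket counts once.
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return pair_counts


def suggest(product, pair_counts, top_n=3):
    """Return up to top_n products most often bought with `product`."""
    related = Counter()
    for (a, b), count in pair_counts.items():
        if a == product:
            related[b] = count
    return [name for name, _ in related.most_common(top_n)]


# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case", "headphones"],
]

pair_counts = build_cooccurrence(transactions)
print(suggest("phone", pair_counts, top_n=2))
```

Real systems work with far more data and usually correct for overall product popularity, but the principle of learning from other users' baskets is the same.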
A data scientist is a person whose main task is to understand data and to develop and refine an algorithm that solves a specific problem reported by the customer. In addition to working with large datasets, the data scientist can also communicate effectively with managers and other stakeholders, who define the specific problem and the results expected from the algorithm, even when the input data is incorrect or inconsistent. What distinguishes the tasks of a data scientist from those of a programmer or a business analyst is drawing inferences from already processed data and selecting the appropriate algorithm. Knowledge of mathematical statistics, probability, and linear algebra helps here, but that is not all. It is also necessary to be familiar with the specific methods and algorithms used, especially in machine learning, for example models based on neural networks or the recently fashionable random forests.
“A data scientist is not just a data analysis specialist with strong programming and code testing skills. In the job market, soft skills such as empathy, understanding customers' professional needs, and strong communication skills are also important, because they help explain the applied solutions to people without a technical background. The models used are often so complicated that it is not easy to convince the people who make the final decision that implementing a given algorithm is the right choice. Another important quality of a data scientist is patience. Training algorithms can take a very long time even on the fastest computers, and it must be repeated several times until a satisfactory result is achieved,” explains Dr. Piotr Szajowski, Data Scientist from the Wrocław office of Capgemini Polska.
Hard data confirms the growing demand for such specialists. “Data science is a very dynamically developing branch of the IT industry, which has been perfectly illustrated in recent months. At the beginning of 2019 there were only a dozen ads in the ‘Data’ category on our website, while currently there are almost 50 offers from all over Poland. Along with the number of advertisements, the number of candidates applying for individual positions has also increased. Comparing the beginning of the year with the current situation, we recorded an almost 5-fold increase in the average number of applications per position,” explains Michał Szum, Customer Care Team Leader at JustJoinIt. “The most sought-after are Data Scientists and Business Analysts, who do not necessarily need to know a programming technology; when one does appear in the requirements, it is most often SQL or Python. What is crucial for this type of position is above all the ability to think analytically, knowledge of statistics, and an understanding of the information that large data sets, i.e. big data, can provide.”
“A problem that has recently appeared quite often in various artificial intelligence projects is anomaly detection. Our customers order large, complex computer systems from us and have us maintain them, and these systems, by their nature, require a rapid response to any observed deviation from normal operation, since such deviations can lead to loss of reputation and other, usually unpleasant, consequences. Automatic detection of anomalies in the operation of systems is a hot topic, and someone in our team is constantly working on these issues. Both methods known from traditional mathematical statistics and the relatively new neural-network-based models developed in recent years are useful here,” explains Marcin Stachowiak, Head of Artificial Intelligence Development at Capgemini Polska.
In many business processes, image analysis using vision systems is a key element, one that makes it possible to assess whether, for example, a product coming off the production line is complete.
Capgemini in Poland has already prepared a prototype device that could be used in retail in the future: a smart shopping cart that can determine which products are in it based on the camera image. Algorithms in this field, developed in recent years, now allow image recognition with a level of precision that already exceeds human capabilities.
Each project is supported by a team of professionals, including data scientists, whose knowledge and experience enable increasingly interesting and practical applications of AI that influence, and will continue to influence, how we function in many aspects of our lives. Their work will also raise awareness of these applications in the business world and, as a result, help spread them. Some claim that artificial intelligence will change our lives, just as the steam engine, electricity, computers, the Internet and, more recently, mobile devices have changed civilization. How the data scientist profession develops will depend to a large extent on whether these predictions come true.