|Lessons Learned for the Data-Driven Business|
|This book has two main goals: to define data science through the work of data scientists and their results, namely data products, while simultaneously providing the reader with relevant lessons learned from applied data science projects at the intersection of academia and industry. As such, it is not a replacement for a classical textbook (i.e., it does not elaborate on fundamentals of methods and principles described elsewhere), but systematically highlights the connection between theory, on the one hand, and its application in specific use cases, on the other.|
|Inference, Causal and Stochastic Analysis|
|STATISTICS (Statistical Science) can be briefly described as the science of problem solving and decision making based on data observed in all the sciences of humanity, in governments and in corporate bodies. Statistical methods reveal important information and knowledge in data sets monitored in application domains such as quality and process control, business governance, supply chain management, urban traffic management, industrial manufacturing and public health. DATA ANALYTICS is a series of texts that provide the essential concepts, statistical methods and practical approaches for problems in the sciences, engineering and technology. The main aim is to develop Statistical Science-based solutions in specific areas of engineering and technology, in the context of sustainable economic development. The book is written to convey basic probabilistic models and statistical methods to readers in the sectors of computing, industry, engineering, production, management, and environmental and actuarial sciences. We hope the book will be a good support for students and professionals who are seeking practical solutions or making optimal decisions using actual observed data.|
|A Practical Introduction to Data Science with Python|
|Data science underlies Amazon’s product recommender, LinkedIn’s People You May Know feature, Pandora’s personalized radio stations, Stripe’s fraud detectors, and the incredible insights arising from the world’s increasingly ubiquitous sensors. In the future, the world’s most interesting and impactful problems will be solved with data science. But right now, there’s a shortage of data scientists in every industry, traditional schools can’t teach students fast enough, and much of the knowledge data scientists need remains trapped in large tech companies.
This comprehensive, practical tutorial is the solution. Drawing on his experience building Zipfian Academy’s immersive 12-week data science training program, Jonathan Dinu brings together all you need to teach yourself data science, and successfully enter the profession.
First, Dinu helps you internalize the data science ‘mindset’: that virtually anything can be quantified, and once you have data, you can harvest amazing insights through statistical analysis and machine learning. He illuminates data science as it really is: a holistic, interdisciplinary process that encompasses the collection, processing, and communication of data: all that data scientists do, say, and believe.
|Unprecedented Paradigmatic Shifts and Practical Advancements|
|We are living at the dawn of what has been termed ‘the fourth paradigm of science,’ a scientific revolution that is marked by both the emergence of big data science and analytics, and by the increasing adoption of the underlying technologies in scientific and scholarly research practices. Everything about science development or knowledge production is fundamentally changing thanks to the ever-increasing deluge of data. This is the primary fuel of the new age, which powerful computational processes or analytics algorithms are using to generate valuable knowledge for enhanced decision-making, and deep insights pertaining to a wide variety of practical uses and applications. This book addresses the complex interplay of the scientific, technological, and social dimensions of the city, and what it entails in terms of the systemic implications for smart sustainable urbanism. In concrete terms, it explores the interdisciplinary and transdisciplinary field of smart sustainable urbanism and the unprecedented paradigmatic shifts and practical advances it is undergoing in light of big data science and analytics. This new era of science and technology embodies an unprecedentedly transformative and constitutive power – manifested not only in the form of revolutionizing science and transforming knowledge, but also in advancing social practices, producing new discourses, catalyzing major shifts, and fostering societal transitions. Of particular relevance, it is instigating a massive change in the way both smart cities and sustainable cities are studied and understood, and in how they are planned, designed, operated, managed, and governed in the face of urbanization. 
This relates to what has been dubbed data-driven smart sustainable urbanism, an emerging approach based on a computational understanding of city systems and processes that reduces urban life to logical and algorithmic rules and procedures, while also harnessing urban big data to provide a more holistic and integrated view or synoptic intelligence of the city. This is increasingly being directed towards improving, advancing, and maintaining the contribution of both sustainable cities and smart cities to the goals of sustainable development.|
|Data Science for Supply Chain Forecast is a book for practitioners focusing on data science and machine learning; it demonstrates how the two are closely interlinked in creating an advanced forecast for the supply chain. As one will discover in this book, artificial intelligence (AI) and machine learning (ML) are not simply a question of coding skills. Using data science to solve a problem requires a scientific mindset more than coding skills. The story behind these models is one of experimentation, of observation and of constant questioning; a true scientific method must be applied to the supply chain. In the data science field, as in that of the supply chain, simple questions do not come with simple answers. To resolve these questions, one needs both to be a scientist and to use the correct tools. In this book, we will discuss both.|
|This book presents cutting edge research on the new ethical challenges posed by biomedical Big Data technologies and practices. ‘Biomedical Big Data’ refers to the analysis of aggregated, very large datasets to improve medical knowledge and clinical care. The book describes the ethical problems posed by aggregation of biomedical datasets and re-use/re-purposing of data, in areas such as privacy, consent, professionalism, power relationships, and ethical governance of Big Data platforms. Approaches and methods are discussed that can be used to address these problems to achieve the appropriate balance between the social goods of biomedical Big Data research and the safety and privacy of individuals. Seventeen original contributions analyse the ethical, social and related policy implications of the analysis and curation of biomedical Big Data, written by leading experts in the areas of biomedical research, medical and technology ethics, privacy, governance and data protection. The book advances our understanding of the ethical conundrums posed by biomedical Big Data, and shows how practitioners and policy-makers can address these issues going forward.|
|Missing Data Analysis in Practice provides practical methods for analyzing missing data along with the heuristic reasoning for understanding the theoretical underpinnings. Drawing on his 25 years of experience researching, teaching, and consulting in quantitative areas, the author presents both frequentist and Bayesian perspectives. He describes easy-to-implement approaches, the underlying assumptions, and practical means for assessing these assumptions. Actual and simulated data sets illustrate important concepts, with the data sets and codes available online. The book underscores the development of missing data methods and their adaptation to practical problems. It mainly focuses on the traditional missing data problem. The author also shows how to use the missing data framework in many other statistical problems, such as measurement error, finite population inference, disclosure limitation, combining information from multiple data sources, and causal inference.|
|A Unified Approach and Tool Support|
|Andrey Kolesnikov proposes an interesting unified approach and corresponding tools for modelling and effective generation of realistic workloads and traffic in networks. As a result of the general applicability in IP-based networks, the outcome of his research can be used for different service interfaces in combination with various workload models and modelling techniques. His work is both broad and deep in focus in order to demonstrate the application of the proposed approach in different realistic scenarios.|
|Computational Methods for Numerical Analysis with R is an overview of traditional numerical analysis topics presented using R. This guide shows how common functions from linear algebra, interpolation, numerical integration, optimization, and differential equations can be implemented in pure R code. Every algorithm described is given with a complete function implementation in R, along with examples to demonstrate the function and its use. Computational Methods for Numerical Analysis with R is intended for those who already know R, but are interested in learning more about how the underlying algorithms work. As such, it is suitable for statisticians, economists, engineers, and others with a computational and numerical background.|
|Big data technologies are used to achieve any type of analytics in a fast and predictable way, thus enabling better human and machine level decision making. Principles of distributed computing are the keys to big data technologies and analytics. The mechanisms related to data storage, data access, data transfer, visualization and predictive modeling using distributed processing on multiple low-cost machines are the key considerations that make big data analytics possible within a stipulated cost and time, and practical for consumption by humans and machines. However, the current literature on big data analytics needs a holistic perspective to highlight the relation between big data analytics and distributed processing for ease of understanding and practitioner use. This book fills the literature gap by addressing key aspects of distributed processing in big data analytics. The chapters tackle the essential concepts and patterns of distributed computing widely used in big data analytics. This book also covers the main technologies which support distributed processing. Finally, this book provides insight into applications of big data analytics, highlighting how principles of distributed computing are used in those situations. Practitioners and researchers alike will find this book a valuable tool for their work, helping them to select the appropriate technologies, while understanding the inherent strengths and drawbacks of those technologies.|