Big data architecture and infrastructure

What precisely is covered by the term "big data"? And how do companies benefit from analysing it? Come and learn this (and more) during this one-day course at ABIS.

These days everybody seems to be working with "big data". But what does this mean precisely? What kind of data are we talking about? What infrastructure does it require? And what does it buy us? This training pursues answers to these questions!

Data is becoming ever more vital to any kind of enterprise. Analysing large amounts of data in order to optimize enterprise processes, marketing, important decisions, ... is not new. But because of steadily increasing data volumes, the growing diversity of data sources, and the broader availability of data, such analysis demands ever more from the infrastructure, the software, and the data models. So much so that a new framework seems necessary: the traditional, established relational model appears to fall short in describing and addressing the new challenges of "data analysis for business intelligence".

"Big data analytics" is the name of this coordinating framework, in which both older models and techniques (like data warehousing, online analytical processing, Hadoop, cluster analysis, ...) and newer insights (data in motion, emotional text analytics, ...) come together. The capability to extract relevant insights from more diverse, larger, and rapidly changing data can help managers and other decision makers to better support their decisions.

This course

gives a general picture of big data and what it represents;
gives an overview of the technologies on which it is based;
goes through the frequently heard technological terms which we need to get acquainted with;
places these terms in context and perspective.

Schedule a training?

Delivered as a live, interactive training – available in-person or online, or in a hybrid format. The training can be given in English, Dutch, or French.

Public training calendar

No public sessions are currently scheduled. We will be pleased to set up an on-site course or to schedule an extra public session (in case of a sufficient number of candidates). Interested? Please let us know.

Intended for

The course is designed for everybody who wants to learn about big data: IT personnel and people confronted with big data technologies, as well as staff without a technical (IT) background.

Background

Elementary knowledge of database management systems is an advantage.

Main topics

Introduction: about data, databases, and data warehouses - and now big data
What is big data?
- Perspective: problem formulation - why big data?
  - data centric management
  - the 4 Vs: volume, variety, velocity, variability - types of data - examples
  - data quality, consistency, and reliability (veracity)
- Big data architecture - components - technologies - towards an integrated data architecture
Overview of new data sources: web statistics ("click streams"); social media; Twitter feeds; Google Maps; sensor data (e.g. surveillance cameras) and the Internet of Things (IoT); ...
NoSQL databases versus relational databases - types and use - popular products today: MongoDB, Cassandra, ...
Big Data Frameworks
- The "divide & conquer" model: Hadoop and MapReduce - distribute data and analyse it through massively parallel algorithms
- Spark: in-memory processing, hence speed - supporting a plethora of data sources
- Machine learning
Performance considerations
Big data analytics - know your data -- or: the role of the data scientist!
- How to judge data quality; risk analysis - and the importance of statistics
- Use of programming languages: Python, R, Scala, ...
- Use of visualisation tools in order to keep an overview and to estimate the relative importance of the different data sources
Overview of often used (open source) products/technologies on the market
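
To give a flavour of the "divide & conquer" model mentioned under Big Data Frameworks, here is a minimal single-machine sketch of the map, shuffle, and reduce steps, written in plain Python. It is an illustration only and not taken from the course material: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample data are assumptions made for this example, and no Hadoop or Spark API is used. In a real cluster each chunk would be processed on a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Each chunk is mapped independently; this is the part that a
    # framework would distribute over many machines.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))


# Hypothetical sample data: two "chunks" standing in for distributed blocks.
chunks = ["big data big insights", "data in motion"]
counts = word_count(chunks)
print(counts)
```

Because the map step only looks at one chunk at a time, the chunks can be processed in parallel; only the shuffle step needs to bring results for the same key together.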

Training method

Classroom training.

Certificate

At the end of the session, the participant receives a "Certificate of Completion".

Duration

Two half days.

Course leader

Peter Vanroose (ABIS), Kris Van Thillo (ABIS).
