R for data analytics

Objectives

Data analytics for business intelligence starts with collecting, storing and cleverly summarizing enterprise data, which nowadays is generated by a wide variety of data sources (click streams, social media, relational data, sensor data, ...).

A popular tool for this kind of analytics is R. Its popularity is partly explained by the fact that it is free, open-source software, but more importantly by the growing number of add-on packages that focus on particular use cases in the broad BI and big data universe.

This course gives you hands-on practice with R, both as a data analytics and graphics tool, and as a programming and scripting environment in which you can extract whatever insight you need from your data.

Main topics

Part I - R fundamentals

  • Getting started
      • installing R (Linux / Windows / macOS)
      • getting to know the command-line interface and the RStudio GUI
      • first steps with R: interactive commands; obtaining online help
      • basic concepts: expressions (numeric, textual); commands & functions; variables & assignment
  • R basics
      • "atomic" data types and how to write their constants: double (numeric), character, integer, logical
      • numeric and logical operators
      • the special values Inf, NaN, NA
      • the vector type; the "c()" function; type coercion; vector operators
      • the "package" concept of R
      • CRAN and www.r-project.org
  • More "structural" data types
      • lists (hierarchical data) and matrices
  • Functions and attributes
      • positional and named parameters
      • creating your own functions
      • R scripts; the startup script; scope of variables; writing comments
      • dump, load, source and related commands
      • dir, ls, getwd and setwd
      • package loading, or using the "::" notation
      • control flow: if, while, for
      • the explicit "print" function; the "cat" function
      • other useful functions: length, names, dimnames, unlist, cbind, rbind, c, as.<type>, is.<type>, order(vector), ...
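
To give a flavour of the Part I topics, here is a minimal sketch in base R covering the special values, vector creation with c(), coercion, and a function with a named parameter. The names v, x and scale_by are made up for this illustration and are not taken from the course material.

```r
# The special values
1 / 0        # Inf
0 / 0        # NaN
NA + 1       # NA: operations on a missing value stay missing

# c() combines values into a vector; mixed types are coerced
# to the most general type present (here: character)
v <- c(1, TRUE, "a")
class(v)     # "character"

# Vector operators work element by element
x <- c(1, 2, 3)
x * 2        # 2 4 6

# A user-defined function with one positional parameter
# and one named parameter that has a default value
scale_by <- function(x, factor = 10) {
  x * factor
}
scale_by(x)              # 10 20 30
scale_by(x, factor = 2)  # 2 4 6
```

Typing these lines one by one at the R prompt prints each result, which is the interactive style the first course day starts from.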

Part II - Data analytics with R

  • Structured data
      • objects and attributes
      • lists, names(), dimnames(), factors
      • reading / writing (structured) data from/to files: read.table, read.csv, readLines, write.csv, ...
      • how to be memory-efficient with large volumes of data
      • data frames
      • how to use a database as "back store"
  • Packages
      • how to install a (third-party) R package
      • examples: the "stats" package and the "ggplot2" package
      • other useful packages: foreign (for reading/writing data of SAS, SPSS, dBase, etc.); XML; AER; tm; vcd; DBI
  • Statistical techniques
      • random number generators
      • sampling, summarizing: basic statistical terminology & techniques
      • examples from the "stats" package; the lm function
      • plotting statistical graphs (scatter plots, histograms, trend lines, ...)
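
As an illustration of how the Part II topics fit together, the following sketch builds a small data frame from random numbers, writes it to a CSV file and reads it back, fits a linear model with lm, and draws a scatter plot with a trend line. It uses only base R and the "stats" package; the data are simulated, and the names d, d2, path and fit are assumptions made for this example, not course material.

```r
# Simulated data: y depends linearly on x, plus random noise
set.seed(42)                      # make the random numbers reproducible
d <- data.frame(x = 1:20)
d$y <- 3 + 2 * d$x + rnorm(20)    # rnorm: normal random number generator

# Write the data frame to a temporary CSV file and read it back
path <- tempfile(fileext = ".csv")
write.csv(d, path, row.names = FALSE)
d2 <- read.csv(path)

# Summarize one column, then fit a linear model y = a + b * x
summary(d2$y)
fit <- lm(y ~ x, data = d2)
coef(fit)                         # estimated intercept and slope

# Scatter plot with the fitted trend line
plot(d2$x, d2$y, xlab = "x", ylab = "y")
abline(fit)
```

Because the noise is small relative to the trend, the estimated slope should come out close to the value 2 used in the simulation.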
 

Intended for

Whoever wants to start practising data analysis in a "big data" context: developers, data architects, marketeers, and anyone who needs to manipulate, visualize, or summarize their corporate data. This course is also a first introduction to the R programming language, so anyone who wants to start using R or one of its many packages is welcome.

Background

This is a beginners' course, so no technical background is required. Familiarity with the concepts of data stores and "big data" is of course advisable (see e.g. "Big data concepts"), as are some notions of statistics (cf. "Statistics fundamentals"). Additionally, we expect you to be familiar with the concepts of a programming language (see e.g. "Programming fundamentals").

Training method

Classroom instruction, focused on practical examples and supported by extensive exercises and individual practice.

Course leader

Peter Vanroose.

Duration

3 days.

Schedule

date      duration (days)   lang.   location       price
10 May    3                 ?       Leuven (BE)    1425 EUR (excl. VAT)
23 Oct    3                 N       Woerden (NL)   1425 EUR (VAT exempt)
15 Nov    3                 ?       Leuven (BE)    1425 EUR (excl. VAT)