Big Data Training Courses

Big Data Training

Big Data is a term that refers to solutions designed for storing and processing large data sets. Initially developed by Google, these Big Data solutions have evolved and inspired other similar projects, many of which are available as open source. Some examples include Apache Hadoop, Cassandra and Cloudera Impala. According to Gartner’s reports, Big Data is the next big step in IT just after Cloud Computing and will be a leading trend over the next several years.

NobleProg onsite live Big Data training courses start with an introduction to the fundamental concepts of Big Data, then progress into the programming languages and methodologies used to perform data analysis. Tools and infrastructure for enabling Big Data storage, distributed processing, and scalability are discussed, compared and implemented in demo practice sessions.

Big Data training is available in various formats, including onsite live training and live instructor-led training using an interactive, remote desktop setup. Local Big Data training can be carried out live on customer premises or in NobleProg local training centers.

Big Data Course Outlines

Code Name Duration Overview
alluxio Alluxio: Unifying disparate storage systems 7 hours Alluxio is an open-source virtual distributed storage system that unifies disparate storage systems and enables applications to interact with data at memory speed. It is used by companies such as Intel, Baidu and Alibaba. In this instructor-led, live training, participants will learn how to use Alluxio to bridge different computation frameworks with storage systems and efficiently manage multi-petabyte scale data as they step through the creation of an application with Alluxio. By the end of this training, participants will be able to: Develop an application with Alluxio Connect big data systems and applications while preserving one namespace Efficiently extract value from big data in any storage format Improve workload performance Deploy and manage Alluxio standalone or clustered Audience Data scientist Developer System administrator Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
ApHadm1 Apache Hadoop: Manipulation and Transformation of Data Performance 21 hours This course is intended for developers, architects, data scientists or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools in the Hadoop ecosystem, this course includes the use of Pig and Hive, both of which are heavily used for data transformation and manipulation. This training also addresses performance metrics and performance optimisation. The course is entirely hands-on and is punctuated by presentations of the theoretical aspects.
sparkdev Spark for Developers 21 hours OBJECTIVE: This course will introduce Apache Spark. The students will learn how Spark fits into the Big Data ecosystem, and how to use Spark for data analysis. The course covers the Spark shell for interactive data analysis, Spark internals, Spark APIs, Spark SQL, Spark Streaming, machine learning and GraphX. AUDIENCE: Developers / Data Analysts
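For a flavor of what the course covers, a minimal PySpark word-count sketch (the input path is an assumption):

    # Minimal PySpark sketch: word count over a text file.
    # Assumes a local Spark installation; the input path is hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()
    lines = spark.read.text("logs.txt")  # each row has a single 'value' column
    words = lines.rdd.flatMap(lambda row: row.value.split())
    counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)
    print(counts.take(10))  # first ten (word, count) pairs
    spark.stop()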
68736 Hadoop for Developers (2 days) 14 hours
hypertable Hypertable: Deploy a BigTable-like database 14 hours Hypertable is an open-source software database management system based on the design of Google's Bigtable. In this instructor-led, live training, participants will learn how to set up and manage a Hypertable database system. By the end of this training, participants will be able to: Install, configure and upgrade a Hypertable instance Set up and administer a Hypertable cluster Monitor and optimize the performance of the database Design a Hypertable schema Work with Hypertable's API Troubleshoot operational issues Audience Developers Operations engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
flink Flink for scalable stream and batch data processing 28 hours Apache Flink is an open-source framework for scalable stream and batch data processing. This instructor-led, live training introduces the principles and approaches behind distributed stream and batch data processing, and walks participants through the creation of a real-time, data streaming application. By the end of this training, participants will be able to: Set up an environment for developing data analysis applications Package, execute, and monitor Flink-based, fault-tolerant, data streaming applications Manage diverse workloads Perform advanced analytics using Flink ML Set up a multi-node Flink cluster Measure and optimize performance Integrate Flink with different Big Data systems Compare Flink capabilities with those of other big data processing frameworks Audience Developers Architects Data engineers Analytics professionals Technical managers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
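As a hedged sketch of the programming model, the following uses Flink's Python API (PyFlink); the sensor readings and threshold are made up:

    # Minimal PyFlink sketch: filter a small stream of sensor readings.
    # Assumes the apache-flink (pyflink) package is installed; data is made up.
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    readings = env.from_collection([("sensor-1", 20.1), ("sensor-2", 35.7)])
    alerts = readings.filter(lambda r: r[1] > 30.0)  # keep readings above 30
    alerts.print()
    env.execute("sensor_alerts")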
voldemort Voldemort: Setting up a key-value distributed data store 14 hours Voldemort is an open-source distributed data store that is designed as a key-value store. It is used at LinkedIn by numerous critical services powering a large portion of the site. This course will introduce the architecture and capabilities of Voldemort and walk participants through the setup and application of a key-value distributed data store. Audience     Software developers     System administrators     DevOps engineers Format of the course     Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
solrdev Solr for Developers 21 hours This course introduces students to the Solr platform. Through a combination of lecture, discussion and labs students will gain hands-on experience configuring effective search and indexing. The class begins with basic Solr installation and configuration, then teaches the attendees the search features of Solr. Students will gain experience with faceting, indexing and search relevance among other features central to the Solr platform. The course wraps up with a number of advanced topics including spell checking, suggestions, Multicore and SolrCloud. Duration: 3 days Audience: Developers, business users, administrators
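For a flavor of the indexing and search workflow, a minimal sketch using the pysolr client (the core URL and document fields are assumptions):

    # Index two documents into a Solr core and run a query with pysolr.
    # The core URL and document fields are hypothetical.
    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/articles", timeout=10)
    solr.add([
        {"id": "1", "title": "Intro to Solr", "body": "Faceting and indexing"},
        {"id": "2", "title": "SolrCloud basics", "body": "Distributed search"},
    ], commit=True)
    for doc in solr.search("title:solr"):
        print(doc["id"], doc["title"])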
hadoopmapr Hadoop Administration on MapR 28 hours Audience: This course is intended to demystify Big Data/Hadoop technology and to show that it is not difficult to understand.
flockdb FlockDB: A Simple Graph Database for Social Media 7 hours FlockDB is an open source distributed, fault-tolerant graph database for managing wide but shallow network graphs. It was initially used by Twitter to store relationships among users. In this instructor-led, live training, participants will learn how to set up and use a FlockDB database to help answer social media questions such as who follows whom, who blocks whom, etc. By the end of this training, participants will be able to: Install and configure FlockDB Understand the unique features of FlockDB, relative to other graph databases such as Neo4j Use FlockDB to maintain a large graph dataset Use FlockDB together with MySQL to provide distributed storage capabilities Query, create and update extremely fast graph edges Scale FlockDB horizontally for use in on-line, low-latency, high throughput web environments Audience Developers Database engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
samza Samza for stream processing 14 hours Apache Samza is an open-source near-realtime, asynchronous computational framework for stream processing.  It uses Apache Kafka for messaging, and Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management. This instructor-led, live training introduces the principles behind messaging systems and distributed stream processing, while walking participants through the creation of a sample Samza-based project and job execution. By the end of this training, participants will be able to: Use Samza to simplify the code needed to produce and consume messages Decouple the handling of messages from an application Use Samza to implement near-realtime asynchronous computation Use stream processing to provide a higher level of abstraction over messaging systems Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
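Samza jobs themselves are written in Java or Scala; purely to illustrate the Kafka messaging layer a Samza job builds on, here is a hedged sketch using the kafka-python client (broker address and topic name are assumptions):

    # Produce one message to a Kafka topic and read it back with kafka-python.
    # Broker address and topic name are hypothetical.
    from kafka import KafkaProducer, KafkaConsumer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("page-views", b'{"user": "u1", "page": "/home"}')
    producer.flush()

    consumer = KafkaConsumer("page-views", bootstrap_servers="localhost:9092",
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=10000)
    for message in consumer:
        print(message.value)
        break  # read a single message and stop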
druid Druid: Build a fast, real-time data analysis system 21 hours Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, Paypal, and Yahoo. In this course we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment. Audience     Application developers     Software engineers     Technical consultants     DevOps professionals     Architecture engineers Format of the course     Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
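As an illustration of how applications typically talk to Druid, a minimal sketch that posts a SQL query to a Druid broker's standard /druid/v2/sql endpoint (host, port and datasource are assumptions):

    # Query Druid's SQL endpoint over HTTP; 8082 is the broker's default port.
    # The 'wikipedia' datasource is an assumption (Druid's tutorial dataset).
    import requests

    response = requests.post(
        "http://localhost:8082/druid/v2/sql",
        json={"query": "SELECT page, COUNT(*) AS views FROM wikipedia "
                       "GROUP BY page ORDER BY views DESC LIMIT 5"},
    )
    for row in response.json():
        print(row)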
hadoopadm1 Hadoop For Administrators 21 hours Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. In this three (optionally, four) day course, attendees will learn about the business benefits and use cases for Hadoop and its ecosystem, how to plan cluster deployment and growth, how to install, maintain, monitor, troubleshoot and optimize Hadoop. They will also practice cluster bulk data load, get familiar with various Hadoop distributions, and practice installing and managing Hadoop ecosystem tools. The course finishes off with a discussion of securing the cluster with Kerberos. “…The materials were very well prepared and covered thoroughly. The Lab was very helpful and well organized” — Andrew Nguyen, Principal Integration DW Engineer, Microsoft Online Advertising Audience Hadoop administrators Format Lectures and hands-on labs, approximate balance 60% lectures, 40% labs.
68780 Apache Spark 14 hours
TalendDI Talend Open Studio for Data Integration 28 hours Talend Open Studio for Data Integration is an open-source data integration product used to combine, convert and update data in various locations across a business. In this instructor-led, live training, participants will learn how to use the Talend ETL tool to carry out data transformation, data extraction, and connectivity with Hadoop, Hive, and Pig. By the end of this training, participants will be able to: Explain the concepts behind ETL (Extract, Transform, Load) and propagation Define ETL methods and ETL tools to connect with Hadoop Efficiently amass, retrieve, digest, consume, transform and shape big data in accordance with business requirements Audience Business intelligence professionals Project managers Database professionals SQL Developers ETL Developers Solution architects Data architects Data warehousing professionals System administrators and integrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
BDATR Big Data Analytics for Telecom Regulators 16 hours To meet regulatory compliance requirements, CSPs (Communication Service Providers) can tap into Big Data Analytics, which not only helps them meet compliance but, within the scope of the same project, also lets them increase customer satisfaction and thus reduce churn. In fact, since compliance is related to quality of service tied to a contract, any initiative toward meeting compliance will improve the competitive edge of the CSPs. Therefore, it is important that regulators be able to advise/guide a set of Big Data analytics practices for CSPs that will be of mutual benefit to the regulators and the CSPs. 2 days of course: 8 modules, 2 hours each = 16 hours
zeppelin Zeppelin for interactive data analytics 14 hours Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop and Spark based data. This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment. By the end of this training, participants will be able to: Install and configure Zeppelin Develop, organize, execute and share data in a browser-based interface Visualize results without referring to the command line or cluster details Execute and collaborate on long workflows Work with any of a number of plug-in language/data-processing-backends, such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown and Shell. Integrate Zeppelin with Spark, Flink and MapReduce Secure multi-user instances of Zeppelin with Apache Shiro Audience Data engineers Data analysts Data scientists Software developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
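A Zeppelin note mixes interpreters per paragraph; as a hedged sketch, a paragraph bound to the Spark Python interpreter might look like the following (the data path is an assumption, and the spark session variable is injected by Zeppelin's Spark interpreter):

    %pyspark
    # A Zeppelin paragraph using the Spark Python interpreter; 'spark' is
    # provided by Zeppelin. The input path is hypothetical.
    df = spark.read.json("/data/events.json")
    df.groupBy("country").count().show()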
accumulo Apache Accumulo: Building highly scalable big data applications 21 hours Apache Accumulo is a sorted, distributed key/value store that provides robust, scalable data storage and retrieval. It is based on the design of Google's BigTable and is powered by Apache Hadoop, Apache Zookeeper, and Apache Thrift. This course covers the working principles behind Accumulo and walks participants through the development of a sample application on Apache Accumulo. Audience Application developers Software engineers Technical consultants Format of the course Part lecture, part discussion, hands-on development and implementation, occasional tests to gauge understanding
dataar Data Analytics With R 21 hours R is a very popular, open source environment for statistical computing, data analytics and graphics. This course introduces the R programming language to students. It covers language fundamentals, libraries and advanced concepts, as well as advanced data analytics and graphing with real-world data. Audience Developers / data analysts Duration 3 days Format Lectures and Hands-on
psr Introduction to Recommendation Systems 7 hours Audience Marketing department employees, IT strategists and other people involved in decisions related to the design and implementation of recommender systems. Format Short theoretical background followed by analysis of working examples and short, simple exercises.
cpb100 Google Cloud Platform Fundamentals: Big Data & Machine Learning 8 hours This one-day instructor-led course introduces participants to the big data capabilities of Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants get an overview of the Google Cloud platform and a detailed view of the data processing and machine learning capabilities. This course showcases the ease, flexibility, and power of big data solutions on Google Cloud Platform. This course teaches participants the following skills: Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform. Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform. Employ BigQuery and Cloud Datalab to carry out interactive data analysis. Train and use a neural network using TensorFlow. Employ ML APIs. Choose between different data processing products on the Google Cloud Platform. This class is intended for the following: Data analysts, Data scientists, Business analysts getting started with Google Cloud Platform. Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results and creating reports. Executives and IT decision makers evaluating Google Cloud Platform for use by data scientists.
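As a taste of the interactive analysis covered here, a minimal sketch using the google-cloud-bigquery client against one of Google's public sample datasets (assumes Google Cloud credentials are already configured):

    # Run an interactive BigQuery query; the table is a real public sample
    # dataset, and credentials are assumed to be configured in the environment.
    from google.cloud import bigquery

    client = bigquery.Client()
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name ORDER BY total DESC LIMIT 5
    """
    for row in client.query(query).result():
        print(row.name, row.total)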
bigdatabicriminal Big Data Business Intelligence for Criminal Intelligence Analysis 35 hours Advances in technologies and the increasing amount of information are transforming how law enforcement is conducted. The challenges that Big Data poses are nearly as daunting as Big Data's promise. Storing data efficiently is one of these challenges; effectively analyzing it is another. In this instructor-led, live training, participants will learn the mindset with which to approach Big Data technologies, assess their impact on existing processes and policies, and implement these technologies for the purpose of identifying criminal activity and preventing crime. Case studies from law enforcement organizations around the world will be examined to gain insights on their adoption approaches, challenges and results. By the end of this training, participants will be able to: Combine Big Data technology with traditional data gathering processes to piece together a story during an investigation Implement industrial big data storage and processing solutions for data analysis Prepare a proposal for the adoption of the most adequate tools and processes for enabling a data-driven approach to criminal investigation Audience Law Enforcement specialists with a technical background Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
magellan Magellan: Geospatial Analytics on Spark 14 hours Magellan is an open-source distributed execution engine for geospatial analytics on big data. Implemented on top of Apache Spark, it extends Spark SQL and provides a relational abstraction for geospatial analytics. This instructor-led, live training introduces the concepts and approaches for implementing geospatial analytics and walks participants through the creation of a predictive analysis application using Magellan on Spark. By the end of this training, participants will be able to: Efficiently query, parse and join geospatial datasets at scale Implement geospatial data in business intelligence and predictive analytics applications Use spatial context to extend the capabilities of mobile devices, sensors, logs, and wearables Audience Application developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
scylladb Scylla database 21 hours Scylla is an open-source distributed NoSQL data store. It is compatible with Apache Cassandra but performs at significantly higher throughputs and lower latencies. In this course, participants will learn about Scylla's features and architecture while obtaining practical experience with setting up, administering, monitoring, and troubleshooting Scylla.   Audience     Database administrators     Developers     System Engineers Format of the course     The course is interactive and includes discussions of the principles and approaches for deploying and managing Scylla distributed databases and clusters. The course includes a heavy component of hands-on exercises and practice.
hadoopba Hadoop for Business Analysts 21 hours Apache Hadoop is the most popular framework for processing Big Data. Hadoop provides rich and deep analytics capability, and it is making inroads into the traditional BI analytics world. This course will introduce an analyst to the core components of the Hadoop ecosystem and its analytics. Audience Business Analysts Duration three days Format Lectures and hands-on labs.
bigdatar Programming with Big Data in R 21 hours
graphcomputing Introduction to Graph Computing 28 hours A large number of real world problems can be described in terms of graphs. For example, the Web graph, the social network graph, the train network graph and the language graph. These graphs tend to be extremely large; processing them requires a specialized set of tools and mindset referred to as graph computing. In this instructor-led, live training, participants will learn about the various technology offerings and implementations for processing graph data. The aim is to identify real-world objects, their characteristics and relationships, then model these relationships and process them as data using graph computing approaches. We start with a broad overview and narrow in on specific tools as we step through a series of case studies, hands-on exercises and live deployments. By the end of this training, participants will be able to: Understand how graph data is persisted and traversed Select the best framework for a given task (from graph databases to batch processing frameworks) Implement Hadoop, Spark, GraphX and Pregel to carry out graph computing across many machines in parallel View real-world big data problems in terms of graphs, processes and traversals Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
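To make the processes-and-traversals framing concrete, here is a toy single-machine breadth-first traversal; frameworks such as Pregel and GraphX distribute this same propagate-along-edges pattern across a cluster (the graph itself is made up):

    # Toy breadth-first traversal over an adjacency list. Distributed graph
    # frameworks parallelize this pattern: propagate values along edges and
    # iterate until no vertex changes.
    from collections import deque

    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}  # made-up graph

    def bfs_distances(source):
        dist, queue = {source: 0}, deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return dist

    print(bfs_distances("A"))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}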
hdp Hortonworks Data Platform (HDP) for administrators 21 hours Hortonworks Data Platform is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem. This instructor-led live training introduces Hortonworks and walks participants through the deployment of a Spark + Hadoop solution. By the end of this training, participants will be able to: Use Hortonworks to reliably run Hadoop at a large scale Unify Hadoop's security, governance, and operations capabilities with Spark's agile analytic workflows. Use Hortonworks to investigate, validate, certify and support each of the components in a Spark project Process different types of data, including structured, unstructured, in-motion, and at-rest. Audience Hadoop administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
kdd Knowledge Discovery in Databases (KDD) 21 hours Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes. Audience     Data analysts or anyone interested in learning how to interpret data to solve problems Format of the course     After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations.
cassdev Cassandra for Developers 21 hours This course will introduce Cassandra –  a popular NoSQL database.  It will cover Cassandra principles, architecture and data model.   Students will learn data modeling  in CQL (Cassandra Query Language) in hands-on, interactive labs.  This session also discusses Cassandra internals and some admin topics. Audience : Developers
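For a flavor of CQL data modeling, a minimal sketch using the DataStax Python driver (cluster address, keyspace and table are assumptions):

    # Create a keyspace and table, insert a row, and read it back with the
    # DataStax cassandra-driver. Names and addresses are hypothetical.
    import uuid
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """)
    session.execute("""
        CREATE TABLE IF NOT EXISTS demo.users (
            user_id uuid PRIMARY KEY, name text, email text)
    """)
    session.execute(
        "INSERT INTO demo.users (user_id, name, email) VALUES (%s, %s, %s)",
        (uuid.uuid4(), "Ada", "ada@example.com"))
    for row in session.execute("SELECT name, email FROM demo.users"):
        print(row.name, row.email)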
d2dbdpa From Data to Decision with Big Data and Predictive Analytics 21 hours Audience If you try to make sense of the data you have access to, or want to analyse unstructured data available on the net (like Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose which data is worth collecting and which is worth analyzing. It is not aimed at people configuring the solution, though those people will benefit from the big picture. Delivery Mode During the course delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentations and simple exercises by the participants. Content and Software used All software used is updated each time the course is run, so we use the newest versions possible. The course covers the process from obtaining, formatting, processing and analysing data, through to automating the decision-making process with machine learning.
matlabpredanalytics Matlab for Predictive Analytics 21 hours Predictive analytics is the process of using data analytics to make predictions about the future. This process uses data along with data mining, statistics, and machine learning techniques to create a predictive model for forecasting future events. In this instructor-led, live training, participants will learn how to use Matlab to build predictive models and apply them to large sample data sets to predict future events based on the data. By the end of this training, participants will be able to: Create predictive models to analyze patterns in historical and transactional data Use predictive modeling to identify risks and opportunities Build mathematical models that capture important trends Use data from devices and business systems to reduce waste, save time, or cut costs Audience Developers Engineers Domain experts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
PentahoDI Pentaho Data Integration Fundamentals 21 hours Pentaho Data Integration is an open-source data integration tool for defining jobs and data transformations. In this instructor-led, live training, participants will learn how to use Pentaho Data Integration's powerful ETL capabilities and rich GUI to manage an entire big data lifecycle, maximizing the value of data to the organization. By the end of this training, participants will be able to: Create, preview, and run basic data transformations containing steps and hops Configure and secure the Pentaho Enterprise Repository Harness disparate sources of data and generate a single, unified version of the truth in an analytics-ready format. Provide results to third-party applications for further processing Audience Data analysts ETL developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
neo4j Beyond the relational database: neo4j 21 hours Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases, such as neo4j, offer. In this hands-on course, we will set up a live project and put into practice the skills to model, manage and access your data. We contrast and compare graph databases with SQL-based databases as well as other NoSQL databases and clarify when and where it makes sense to implement each within your infrastructure. Audience Database administrators (DBAs) Data analysts Developers System Administrators DevOps engineers Business Analysts CTOs CIOs Format of the course Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.
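As an illustration of that compact syntax, a minimal Cypher sketch via the official neo4j Python driver (URI, credentials and data are assumptions):

    # Create two connected nodes and traverse the relationship with Cypher.
    # Connection URI and credentials are hypothetical.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))
    with driver.session() as session:
        session.run("CREATE (:Person {name: 'Ada'})-[:KNOWS]->"
                    "(:Person {name: 'Alan'})")
        result = session.run("MATCH (p:Person)-[:KNOWS]->(f:Person) "
                             "RETURN p.name AS name, f.name AS friend")
        for record in result:
            print(record["name"], "knows", record["friend"])
    driver.close()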
hadoopdev Hadoop for Developers (4 days) 28 hours Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course will introduce a developer to the various components (HDFS, MapReduce, Pig, Hive and HBase) of the Hadoop ecosystem.
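For a flavor of the MapReduce model, a classic word-count mapper and reducer written as Hadoop Streaming scripts; Hadoop Streaming runs the map and reduce steps as ordinary programs that read stdin and write stdout (the file names are conventional, not required):

    # mapper.py -- emit a (word, 1) pair per input word, tab-separated.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # reducer.py -- Hadoop delivers mapper output sorted by key, so equal
    # words arrive consecutively and can be summed in one pass.
    import sys

    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")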
iotemi IoT (Internet of Things) for Entrepreneurs, Managers and Investors 21 hours Unlike other technologies, IoT is far more complex, encompassing almost every branch of core engineering: mechanical, electronics, firmware, middleware, cloud, analytics and mobile. For each of its engineering layers, there are aspects of economics, standards, regulations and the evolving state of the art. For the first time, a modest course is offered to cover all of these critical aspects of IoT engineering. Summary An advanced training program covering the current state of the art in the Internet of Things Cuts across multiple technology domains to develop awareness of an IoT system and its components and how it can help businesses and organizations. Live demo of model IoT applications to showcase practical IoT deployments across different industry domains, such as Industrial IoT, Smart Cities, Retail, Travel & Transportation and use cases around connected devices & things Target Audience Managers responsible for business and operational processes within their respective organizations who want to know how to harness IoT to make their systems and processes more efficient. Entrepreneurs and investors who are looking to build new ventures and want to develop a better understanding of the IoT technology landscape to see how they can leverage it in an effective manner. Estimates for the Internet of Things or IoT market value are massive, since by definition the IoT is an integrated and diffused layer of devices, sensors, and computing power that overlays entire consumer, business-to-business, and government industries. The IoT will account for an increasingly huge number of connections: 1.9 billion devices today, and 9 billion by 2018. That year, it will be roughly equal to the number of smartphones, smart TVs, tablets, wearable computers, and PCs combined. In the consumer space, many products and services have already crossed over into the IoT, including kitchen and home appliances, parking, RFID, lighting and heating products, and a number of applications in the Industrial Internet. However, the underlying technologies of IoT are nothing new, as M2M communication has existed since the birth of the Internet. What has changed in the last couple of years is the emergence of a number of inexpensive wireless technologies, aided by the overwhelming adoption of smartphones and tablets in every home. The explosive growth of mobile devices led to the present demand for IoT. Due to unbounded opportunities in IoT business, a large number of small and medium-sized entrepreneurs have jumped on the IoT bandwagon. Also, due to the emergence of open-source electronics and IoT platforms, the cost of developing an IoT system and managing its sizable production is increasingly affordable. Existing electronic product owners are experiencing pressure to integrate their devices with the Internet or mobile apps. This training is intended as a technology and business review of an emerging industry so that IoT enthusiasts/entrepreneurs can grasp the basics of IoT technology and business. Course Objective The main objective of the course is to introduce emerging technological options, platforms and case studies of IoT implementation in home & city automation (smart homes and cities), Industrial Internet, healthcare, government, mobile cellular and other areas.
Basic introduction to all the elements of IoT: mechanical, electronics/sensor platform, wireless and wireline protocols, mobile-to-electronics integration, mobile-to-enterprise integration, data analytics and the total control plane
M2M wireless protocols for IoT (WiFi, Zigbee/Z-Wave, Bluetooth, ANT+): when and where to use which one
Mobile/desktop/web apps for registration, data acquisition and control; available M2M data acquisition platforms for IoT: Xively, Omega, NovoTech, etc.
Security issues and security solutions for IoT
Open-source/commercial electronics platforms for IoT: Raspberry Pi, Arduino, ARM mbed LPC, etc.
Open-source/commercial enterprise cloud platforms: AWS IoT, Azure IoT, Watson IoT, in addition to other minor IoT clouds
Studies of the business and technology of some common IoT devices: home automation, smoke alarms, vehicles, military, home health, etc.
nifidev Apache NiFi for Developers 7 hours Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a web-based user interface to manage dataflows in real time. In this instructor-led, live training, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi. By the end of this training, participants will be able to: Understand NiFi's architecture and dataflow concepts Develop extensions using NiFi and third-party APIs Custom-develop their own Apache NiFi processors Ingest and process real-time data from disparate and uncommon file formats and data sources Audience Developers Data engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
pythonmultipurpose Advanced Python 28 hours In this instructor-led training, participants will learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, finance, data analysis and visualization, UI programming and maintenance scripting. Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice Notes If you wish to add, remove or customize any section or topic within this course, please contact us to arrange.
IntroToAvro Apache Avro: Data serialization for distributed applications 14 hours This course is intended for Developers Format of the course Lectures, hands-on practice, small tests along the way to gauge understanding
hadoopdeva Advanced Hadoop for Developers 21 hours Apache Hadoop is one of the most popular frameworks for processing Big Data on clusters of servers. This course delves into data management in HDFS, advanced Pig, Hive, and HBase.  These advanced programming techniques will be beneficial to experienced Hadoop developers. Audience: developers Duration: three days Format: lectures (50%) and hands-on labs (50%).  
bdbitcsp Big Data Business Intelligence for Telecom and Communication Service Providers 35 hours Overview Communication service providers (CSPs) are facing pressure to reduce costs and maximize average revenue per user (ARPU), while ensuring an excellent customer experience, but data volumes keep growing. Global mobile data traffic will grow at a compound annual growth rate (CAGR) of 78 percent to 2016, reaching 10.8 exabytes per month. Meanwhile, CSPs are generating large volumes of data, including call detail records (CDR), network data and customer data. Companies that fully exploit this data gain a competitive edge. According to a recent survey by The Economist Intelligence Unit, companies that use data-directed decision-making enjoy a 5-6% boost in productivity. Yet 53% of companies leverage only half of their valuable data, and one-fourth of respondents noted that vast quantities of useful data go untapped. The data volumes are so high that manual analysis is impossible, and most legacy software systems can’t keep up, resulting in valuable data being discarded or ignored. With Big Data & Analytics’ high-speed, scalable big data software, CSPs can mine all their data for better decision making in less time. Different Big Data products and techniques provide an end-to-end software platform for collecting, preparing, analyzing and presenting insights from big data. Application areas include network performance monitoring, fraud detection, customer churn detection and credit risk analysis. Big Data & Analytics products scale to handle terabytes of data, but implementing such tools requires a new kind of cloud-based database system, like Hadoop, or massive-scale parallel computing processors (KPU, etc.). This course on Big Data BI for Telco covers all the emerging new areas in which CSPs are investing for productivity gains and opening up new business revenue streams. The course will provide a complete 360-degree overview of Big Data BI in Telco so that decision makers and managers can have a very wide and comprehensive overview of the possibilities of Big Data BI in Telco for productivity and revenue gain. Course objectives The main objective of the course is to introduce new Big Data business intelligence techniques in 4 sectors of the Telecom business (Marketing/Sales, Network Operation, Financial Operation and Customer Relation Management).
Students will be introduced to the following:
Introduction to Big Data: what the 4Vs (volume, velocity, variety and veracity) are in Big Data; generation, extraction and management from a Telco perspective
How Big Data analytics differs from legacy data analytics
In-house justification of Big Data: a Telco perspective
Introduction to the Hadoop ecosystem: familiarity with all Hadoop tools like Hive, Pig and Spark, and when and how they are used to solve Big Data problems
How Big Data is extracted for analysis with analytics tools: how Business Analysts can reduce their pain points in the collection and analysis of data through an integrated Hadoop dashboard approach
Basic introduction to insight analytics, visualization analytics and predictive analytics for Telco
Customer churn analytics and Big Data: how Big Data analytics can reduce customer churn and customer dissatisfaction in Telco (case studies)
Network failure and service failure analytics from network metadata and IPDR
Financial analysis: fraud, wastage and ROI estimation from sales and operational data
Customer acquisition: target marketing, customer segmentation and cross-selling from sales data
Introduction and summary of all Big Data analytics products and where they fit into the Telco analytics space
Conclusion: how to take a step-by-step approach to introducing Big Data Business Intelligence in your organization
Target Audience Network operations, financial managers, CRM managers and top IT managers in the Telco CIO office. Business Analysts in Telco. CFO office managers/analysts. Operational managers. QA managers.
nifi Apache NiFi for Administrators 21 hours Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a web-based user interface to manage dataflows in real time. In this instructor-led, live training, participants will learn how to deploy and manage Apache NiFi in a live lab environment. By the end of this training, participants will be able to: Install and configure Apache NiFi Source, transform and manage data from disparate, distributed data sources, including databases and big data lakes Automate dataflows Enable streaming analytics Apply various approaches for data ingestion Transform Big Data into business insights Audience System administrators Data engineers Developers DevOps Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
matlabfundamentalsfinance MATLAB Fundamentals + MATLAB for Finance 35 hours This course provides a comprehensive introduction to the MATLAB technical computing environment + an introduction to using MATLAB for financial applications. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: Working with the MATLAB user interface Entering commands and creating variables Analyzing vectors and matrices Visualizing vector and matrix data Working with data files Working with data types Automating commands with scripts Writing programs with logic and flow control Writing functions Using the Financial Toolbox for quantitative analysis
dsbda Data Science for Big Data Analytics 35 hours Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
bigdatastore Big Data Storage Solution - NoSQL 14 hours When traditional storage technologies don't handle the amount of data you need to store, there are hundreds of alternatives. This course guides participants through the alternatives for storing and analyzing Big Data and their pros and cons. The course is mostly focused on discussion and presentation of solutions, though hands-on exercises are available on demand.
bdbiga Big Data Business Intelligence for Govt. Agencies 35 hours Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset, and government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to evolve data-driven organizations to successfully accomplish their missions, they are laying the groundwork to correlate dependencies across events, people, processes, and information. High-value government solutions will be created from a mashup of the most disruptive technologies: Mobile devices and applications Cloud services Social business technologies and networking Big Data and analytics IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data — related and unrelated, structured and unstructured. But accomplishing these feats takes far more than simply accumulating massive quantities of data. “Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information,” Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog. The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it. The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge. Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.) Custom-developed Big Data tools also are allowing agencies to address the need to analyze their data.
For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers.
kdbplusandq kdb+ and q: Analyze time series data 21 hours kdb+ is an in-memory, column-oriented database and q is its built-in, interpreted vector-based language. In kdb+, tables are columns of vectors and q is used to perform operations on the table data as if it was a list. kdb+ and q are commonly used in high frequency trading and are popular with the major financial institutions, including Goldman Sachs, Morgan Stanley, Merrill Lynch, JP Morgan, etc. In this instructor-led, live training, participants will learn how to create a time series data application using kdb+ and q. By the end of this training, participants will be able to: Understand the difference between a row-oriented database and a column-oriented database Select data, write scripts and create functions to carry out advanced analytics Analyze time series data such as stock and commodity exchange data Use kdb+'s in-memory capabilities to store, analyze, process and retrieve large data sets at high speed Think of functions and data at a higher level than the standard function(arguments) approach common in non-vector languages Explore other time-sensitive applications for kdb+, including energy trading, telecommunications, sensor data, log data, and machine and network usage monitoring Audience Developers Database engineers Data scientists Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
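q itself is beyond a short sketch, but the whole-column style it encourages can be shown by analogy in pandas; this is an analogue only, not q, and the trade data is made up:

    # A pandas analogue of column-vector thinking: operate on whole columns,
    # not row by row. This illustrates the style, not the q language itself.
    import pandas as pd

    trades = pd.DataFrame({
        "time": pd.date_range("2024-01-01 09:30", periods=5, freq="1min"),
        "price": [101.0, 101.5, 101.2, 102.0, 101.8],
    })
    trades["ma3"] = trades["price"].rolling(3).mean()  # 3-tick moving average
    print(trades)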
kylin Apache Kylin: From classic OLAP to real-time data warehouse 14 hours Apache Kylin is an extreme, distributed analytics engine for big data. In this instructor-led live training, participants will learn how to use Apache Kylin to set up a real-time data warehouse. By the end of this training, participants will be able to: Consume real-time streaming data using Kylin Utilize Apache Kylin's powerful features, including snowflake schema support, a rich SQL interface, spark cubing and subsecond query latency Note We use the latest version of Kylin (as of this writing, Apache Kylin v2.0) Audience Big data engineers Big Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
cassadmin Cassandra Administration 14 hours This course will introduce Cassandra –  a popular NoSQL database.  It will cover Cassandra principles, architecture and data model.   Students will learn data modeling  in CQL (Cassandra Query Language) in hands-on, interactive labs.  This session also discusses Cassandra internals and some admin topics.
rintrob Introductory R for Biologists 28 hours R is an open-source free programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has also found followers among statisticians, engineers and scientists without computer programming skills who find it easy to use. Its popularity is due to the increasing use of data mining for various goals such as setting ad prices, finding new drugs more quickly or fine-tuning financial models. R has a wide variety of packages for data mining.
rneuralnet Neural Network in R 14 hours This course is an introduction to applying neural networks in real world problems using R-project software.
deckgl deck.gl: Visualizing Large-scale Geospatial Data 14 hours deck.gl is an open-source, WebGL-powered library for exploring and visualizing data assets at scale. Created by Uber, it is especially useful for gaining insights from geospatial data sources, such as data on maps. This instructor-led, live training introduces the concepts and functionality behind deck.gl and walks participants through the set up of a demonstration project. By the end of this training, participants will be able to: Take data from very large collections and turn it into compelling visual representations Visualize data collected from transportation and journey-related use cases, such as pick-up and drop-off experiences, network traffic, etc. Apply layering techniques to geospatial data to depict changes in data over time Integrate deck.gl with React (for Reactive programming) and Mapbox GL (for visualizations on Mapbox based maps). Understand and explore other use cases for deck.gl, including visualizing points collected from a 3D indoor scan, visualizing machine learning models in order to optimize their algorithms, etc. Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
glusterfs GlusterFS for System Administrators 21 hours GlusterFS is an open-source distributed file storage system that can scale up to petabytes of capacity. GlusterFS is designed to provide additional space depending on the user's storage requirements. A common application for GlusterFS is cloud computing storage systems. In this instructor-led training, participants will learn how to use normal, off-the-shelf hardware to create and deploy a storage system that is scalable and always available.  By the end of the course, participants will be able to: Install, configure, and maintain a full-scale GlusterFS system. Implement large-scale storage systems in different types of environments. Audience System administrators Storage administrators Format of the Course Part lecture, part discussion, exercises and heavy hands-on practice.
altdomexp Analytics Domain Expertise 7 hours This course is part of the Data Scientist skill set (Domain: Analytics Domain Expertise).
matlab2 MATLAB Fundamentals 21 hours This three-day course provides a comprehensive introduction to the MATLAB technical computing environment. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include:     Working with the MATLAB user interface     Entering commands and creating variables     Analyzing vectors and matrices     Visualizing vector and matrix data     Working with data files     Working with data types     Automating commands with scripts     Writing programs with logic and flow control     Writing functions
mdlmrah Model MapReduce and Apache Hadoop 14 hours The course is intended for IT specialists who work with distributed processing of large data sets across clusters of computers.
datavault Data Vault: Building a Scalable Data Warehouse 28 hours Data vault modeling is a database modeling technique that provides long-term historical storage of data that originates from multiple sources. A data vault stores a single version of the facts, or "all the data, all of the time". Its flexible, scalable, consistent and adaptable design encompasses the best aspects of 3rd normal form (3NF) and star schema. In this instructor-led, live training, participants will learn how to build a Data Vault. By the end of this training, participants will be able to: Understand the architecture and design concepts behind Data Vault 2.0, and its interaction with Big Data, NoSQL and AI. Use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse Develop a consistent and repeatable ETL (Extract, Transform, Load) process Build and deploy highly scalable and repeatable warehouses Audience Data modelers Data warehousing specialist Business Intelligence specialists Data engineers Database administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
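To make the table types concrete, a minimal sketch of a hub and a satellite, two of Data Vault's core structures, using sqlite3 purely for illustration (the column names follow common Data Vault conventions and are assumptions, not official DDL):

    # A Data Vault hub (business key) and satellite (descriptive attributes
    # tracked over time), sketched with sqlite3. Names are conventional.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE hub_customer (
            customer_hk   TEXT PRIMARY KEY,  -- hash of the business key
            customer_id   TEXT NOT NULL,     -- the business key itself
            load_date     TEXT NOT NULL,
            record_source TEXT NOT NULL
        );
        CREATE TABLE sat_customer_details (
            customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
            load_date     TEXT NOT NULL,
            name          TEXT,
            email         TEXT,
            PRIMARY KEY (customer_hk, load_date)  -- one row per load: history
        );
    """)
    print([r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")])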
storm Apache Storm 28 hours Apache Storm is a distributed, real-time computation engine used for enabling real-time business intelligence. It does so by enabling applications to reliably process unbounded streams of data (a.k.a. stream processing). "Storm is for real-time processing what Hadoop is for batch processing!" In this instructor-led live training, participants will learn how to install and configure Apache Storm, then develop and deploy an Apache Storm application for processing big data in real-time. Topics covered in this training include: Apache Storm in the context of Hadoop Working with unbounded data Continuous computation Real-time analytics Distributed RPC and ETL processing Audience Software and ETL developers Mainframe professionals Data scientists Big data analysts Hadoop professionals Format of the course     Part lecture, part discussion, exercises and heavy hands-on practice
dmmlr Data Mining & Machine Learning with R 14 hours R is an open-source free programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining.
datashrinkgov Data Shrinkage for Government 14 hours
smtwebint Semantic Web Overview 7 hours The Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) that promotes common formats for data on the World Wide Web. The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
datameer Datameer for Data Analysts 14 hours Datameer is a business intelligence and analytics platform built on Hadoop. It allows end-users to access, explore and correlate large-scale, structured, semi-structured and unstructured data in an easy-to-use fashion. In this instructor-led, live training, participants will learn how to use Datameer to overcome Hadoop's steep learning curve as they step through the setup and analysis of a series of big data sources. By the end of this training, participants will be able to: Create, curate, and interactively explore an enterprise data lake Access business intelligence data warehouses, transactional databases and other analytic stores Use a spreadsheet user-interface to design end-to-end data processing pipelines Access pre-built functions to explore complex data relationships Use drag-and-drop wizards to visualize data and create dashboards Use tables, charts, graphs, and maps to analyze query results Audience Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
hadoopforprojectmgrs Hadoop for Project Managers 14 hours As more and more software and IT projects migrate from local processing and data management to distributed processing and big data storage, Project Managers are finding the need to upgrade their knowledge and skills to grasp the concepts and practices relevant to Big Data projects and opportunities. This course introduces Project Managers to the most popular Big Data processing framework: Hadoop.   In this instructor-led training, participants will learn the core components of the Hadoop ecosystem and how these technologies can be used to solve large-scale problems. In learning these foundations, participants will also improve their ability to communicate with the developers and implementers of these systems as well as the data scientists and analysts that many IT projects involve. Audience Project Managers wishing to implement Hadoop into their existing development or IT infrastructure Project Managers needing to communicate with cross-functional teams that include big data engineers, data scientists and business analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
rprogda R Programming for Data Analysis 14 hours This course is part of the Data Scientist skill set (Domain: Data and Technology)
hadoopadm Hadoop Administration 21 hours The course is dedicated to IT specialists who are looking for a solution to store and process large data sets in a distributed system environment. Course goal: gaining knowledge of Hadoop cluster administration.
sparkpython Python and Spark for Big Data (PySpark) 21 hours Python is a high-level programming language famous for its clear syntax and code readability. Spark is a data processing engine used in querying, analyzing, and transforming big data. PySpark allows users to interface Spark with Python. In this instructor-led, live training, participants will learn how to use Python and Spark together to analyze big data as they work on hands-on exercises. By the end of this training, participants will be able to: Learn how to use Spark with Python to analyze Big Data Work on exercises that mimic real world circumstances Use different tools and techniques for big data analysis using PySpark Audience Developers IT Professionals Data Scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
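A minimal sketch of the kind of PySpark DataFrame exercise used in this training (the sample records are made up):

    # Aggregate made-up event counts per user with the PySpark DataFrame API.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("PySparkIntro").getOrCreate()
    df = spark.createDataFrame(
        [("alice", "click", 3), ("bob", "view", 7), ("alice", "view", 2)],
        ["user", "event", "count"])
    df.groupBy("user").agg(F.sum("count").alias("total")).show()
    spark.stop()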
ApacheIgnite Apache Ignite: Improve speed, scale and availability with in-memory computing 14 hours Apache Ignite is an in-memory computing platform that sits between the application and data layer to improve speed, scale and availability. In this instructor-led, live training, participants will learn the principles behind persistent and pure in-memory storage as they step through the creation of a sample in-memory computing project. By the end of this training, participants will be able to: Use Ignite for in-memory, on-disk persistence as well as a purely distributed in-memory database Achieve persistence without syncing data back to a relational database Use Ignite to carry out SQL and distributed joins Improve performance by moving data closer to the CPU, using RAM as storage Spread data sets across a cluster to achieve horizontal scalability Integrate Ignite with RDBMS, NoSQL, Hadoop and machine learning processors Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
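For illustration, a minimal sketch using the pyignite thin client against a single Ignite node; it assumes a node is listening on the default thin-client port 10800, and the cache name and keys are made up.

    from pyignite import Client

    client = Client()
    client.connect("127.0.0.1", 10800)  # default thin-client port

    # Caches are distributed key-value stores kept in RAM (optionally persisted to disk).
    cache = client.get_or_create_cache("quotes")
    cache.put("AAPL", 174.35)
    print(cache.get("AAPL"))

    client.close()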
cassdev1 Cassandra for Developers - Bespoke 21 hours This course will introduce Cassandra – a popular NoSQL database. It will cover Cassandra principles, architecture and data model. Students will learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. This session also discusses Cassandra internals and some admin topics. Duration: 3 days Audience: Developers
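To give a feel for CQL data modeling, a minimal sketch using the DataStax Python driver; the keyspace, table, and query are hypothetical, and a Cassandra node is assumed on localhost.

    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """)
    # The partition key (country) determines data placement; user_id clusters rows within it.
    session.execute("""
        CREATE TABLE IF NOT EXISTS demo.users_by_country (
            country text, user_id uuid, name text,
            PRIMARY KEY (country, user_id)
        )
    """)
    for row in session.execute("SELECT name FROM demo.users_by_country WHERE country = 'PL'"):
        print(row.name)

    cluster.shutdown()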
bigddbsysfun Big Data & Database Systems Fundamentals 14 hours The course is part of the Data Scientist skill set (Domain: Data and Technology).
apacheh Administrator Training for Apache Hadoop 35 hours Audience: The course is intended for IT specialists looking for a solution to store and process large data sets in a distributed system environment. Goal: deep knowledge of Hadoop cluster administration.
apachedrill Apache Drill for On-the-Fly Analysis of Multiple Big Data Formats 21 hours Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL and other cloud and file storage systems. Apache Drill's power lies in its ability to join data from multiple data stores using a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. In this instructor-led, live training, participants will learn the fundamentals of Apache Drill, then leverage the power and convenience of SQL to interactively query big data without writing code. Participants will also learn how to optimize their Drill queries for distributed SQL execution. By the end of this training, participants will be able to: Perform "self-service" exploration on structured and semi-structured data on Hadoop Query known as well as unknown data using SQL queries Understand how Apache Drill receives and executes queries Write SQL queries to analyze different types of data, including structured data in Hive, semi-structured data in HBase or MapR-DB tables, and data saved in files such as Parquet and JSON. Use Apache Drill to perform on-the-fly schema discovery, bypassing the need for complex ETL and schema operations Integrate Apache Drill with BI (Business Intelligence) tools such as Tableau, Qlikview, MicroStrategy and Excel Audience Data analysts Data scientists SQL programmers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
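As an illustration of querying raw files with SQL, a minimal sketch against Drill's REST API; the file path and fields are hypothetical, and Drill's embedded web server is assumed on its default port 8047.

    import requests

    # Submit a SQL query over HTTP; Drill infers the schema of the JSON file on the fly.
    resp = requests.post(
        "http://localhost:8047/query.json",
        json={
            "queryType": "SQL",
            "query": "SELECT t.name, t.age FROM dfs.`/data/people.json` t WHERE t.age > 30",
        },
    )
    for row in resp.json()["rows"]:
        print(row)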
vespa Vespa: Serving large-scale data in real-time 14 hours Vespa is an open-source big data processing and serving engine created by Yahoo. It is used to respond to user queries, make recommendations, and provide personalized content and advertisements in real time. This instructor-led, live training introduces the challenges of serving large-scale data and walks participants through the creation of an application that can compute responses to user requests over large datasets in real time. By the end of this training, participants will be able to: Use Vespa to quickly compute data (store, search, rank, organize) at serving time while a user waits Implement Vespa into existing applications involving feature search, recommendations, and personalization Integrate and deploy Vespa with existing big data systems such as Hadoop and Storm. Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
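To show what a serving-time query looks like, a minimal sketch against Vespa's HTTP search API; a query container is assumed on the default port 8080, and the returned fields depend on the (hypothetical) application schema.

    import requests

    # Ask Vespa to rank and return the top 5 hits for a user query, computed at serving time.
    resp = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": "select * from sources * where userQuery()",
            "query": "big data",
            "hits": 5,
        },
    )
    for hit in resp.json().get("root", {}).get("children", []):
        print(hit["relevance"], hit.get("fields", {}))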
teraintro Teradata Fundamentals 21 hours Teradata is a popular relational database management system, mainly suited to building large-scale data warehousing applications, which it achieves through parallelism. This course introduces delegates to Teradata.
osovv OpenStack Overview 7 hours The course is dedicated to IT engineers and architects who are looking for a solution to host private or public IaaS (Infrastructure as a Service) clouds. It is also a great opportunity for IT managers to gain an overview of the possibilities OpenStack can enable. Before you spend a lot of money on an OpenStack implementation, you can weigh all the pros and cons by attending our course. This topic is also available as individual consultancy. Course goal: gaining basic knowledge regarding OpenStack
datamin Data Mining 21 hours The course can be delivered with any tools, including free open-source data mining software and applications
BigData_ A practical introduction to Data Analysis and Big Data 35 hours Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class. The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability. Audience Developers / programmers IT consultants Format of the course Part lecture, part discussion, hands-on practice and implementation, occasional quizzing to measure progress.
apex Apache Apex: Processing big data-in-motion 21 hours Apache Apex is a YARN-native platform that unifies stream and batch processing. It processes big data-in-motion in a way that is scalable, performant, fault-tolerant, stateful, secure, distributed, and easily operable. This instructor-led, live training introduces Apache Apex's unified stream processing architecture and walks participants through the creation of a distributed application using Apex on Hadoop. By the end of this training, participants will be able to: Understand data processing pipeline concepts such as connectors for sources and sinks, common data transformations, etc. Build, scale and optimize an Apex application Process real-time data streams reliably and with minimum latency Use Apex Core and the Apex Malhar library to enable rapid application development Use the Apex API to write and re-use existing Java code Integrate Apex into other applications as a processing engine Tune, test and scale Apex applications Audience Developers Enterprise architects Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
DM7 Getting started with DM7 21 hours Audience Beginner or intermediate database developers Beginner or intermediate database administrators Programmers Format of the course Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development
hbasedev HBase for Developers 21 hours This course introduces HBase – a NoSQL store on top of Hadoop. The course is intended for developers who will be using HBase to develop applications, and administrators who will manage HBase clusters. We will walk developers through HBase architecture, data modeling, and application development on HBase. The course also discusses using MapReduce with HBase and some administration topics related to performance optimization. The course is very hands-on with lots of lab exercises. Duration: 3 days Audience: Developers & Administrators
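For a flavor of HBase data modeling from a client, a minimal sketch using the happybase library over the HBase Thrift gateway; the table, column family, and row-key scheme are hypothetical, and a Thrift server is assumed on localhost.

    import happybase

    connection = happybase.Connection("localhost", port=9090)  # Thrift gateway
    connection.create_table("metrics", {"d": dict()})          # one column family, 'd'

    table = connection.table("metrics")
    # Row keys are byte strings; designing them well is the heart of HBase data modeling.
    table.put(b"sensor1#20180320", {b"d:temp": b"21.5", b"d:hum": b"40"})

    # A prefix scan retrieves all rows for one sensor, sorted by key.
    for key, data in table.scan(row_prefix=b"sensor1#"):
        print(key, data)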
apachemdev Apache Mahout for Developers 14 hours Audience Developers involved in projects that use machine learning with Apache Mahout. Format Hands-on introduction to machine learning. The course is delivered in a lab format based on real-world practical use cases.
aifortelecom AI Awareness for Telecom 14 hours AI is a collection of technologies for building intelligent systems capable of understanding data and the activities surrounding the data to make "intelligent decisions". For Telecom providers, building applications and services that make use of AI could open the door for improved operations and servicing in areas such as maintenance and network optimization. In this course we examine the various technologies that make up AI and the skill sets required to put them to use. Throughout the course, we examine AI's specific applications within the Telecom industry. Audience Network engineers Network operations personnel Telecom technical managers Format of the course     Part lecture, part discussion, hands-on exercises

Upcoming Courses

Course | Course Date | Course Price [Remote / Classroom]
Hadoop for Business Analysts - Buenos Aires - Laminar Catalinas | Tue, 2018-03-20 09:30 | 5259 USD / 6344 USD
