Big Data and Analytics in Education and Learning

With the growth of the internet, mobile technologies, multimedia, social media and the ever-increasing Internet of Things, both the data we can mine effectively and the types of information we can extract from that data are evolving rapidly. In a recent report, the McKinsey Global Institute (MGI) estimated that the amount of data generated globally is growing by roughly 40% a year. The term “Big data” has emerged to describe “datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse” (McKinsey, 2011). In other words, Big data refers to data sets that can no longer be easily managed or analysed with traditional or common data management tools, methods and infrastructures. According to Gartner, the challenges of Big data come from three dimensions:

Volume: the increase in data volumes within enterprise systems creates both a storage problem and a large-scale analysis problem.

Variety: many different types of information from various sources are available and need to be analysed, including databases, documents, e-mail, video, still images, audio and financial transactions.

Velocity: both how fast data is being produced and how fast it must be processed to meet demand. This involves streams of data, structured record creation, and availability for access and delivery (Gartner, 2011).

These characteristics bring new challenges to traditional Business Intelligence (BI) and analytics and require new approaches, new software tools, and new skill sets to manage and extract value from new, complex, unstructured and voluminous data sources.

Big Data has made its way onto the Gartner Hype Cycle for 2011, with mainstream adoption expected in 2 to 5 years. According to Gartner, “By 2015, companies that have adopted big data and extreme information management will begin to outperform unprepared competitors by 20% in every available financial metric”. It is likely that big data will provide new opportunities for data service providers, content/information publishers, and software companies to offer optimised services and platforms that help organisations make better business decisions. For example, Oracle has developed a comprehensive Big data strategy, which includes releasing Hadoop data-management software, a NoSQL database and R analytics. IBM has also unveiled its InfoSphere BigInsights platform for big data analysis. Many governments, sectors and corporations see Big data as a key strategic business asset for their future development and have started to experiment with Big data technologies as a complement or alternative to traditional data management and analysis.

How will higher education (HE) institutions address the opportunities and challenges of Big data in education? According to the MGI Big Data report, education in the US is the tenth largest data sector, storing and managing approximately 267 petabytes of information. However, compared to other sectors, education faces higher hurdles because of the lack of a data-driven mind-set and of available data. With an increased focus on issues such as data-informed accountability and transparency, student retention and academic achievement, teacher performance, and added value and productivity in education, big data will play an important role in guiding education reform, helping institutions to develop business strategies and assisting educators to improve teaching and learning. While all sectors face the challenge of making effective use of big data, several general development trends for big data in education can already be detected, for example:

  • One of the key challenges for big data in education is to develop data-informed mind-sets and to make sure that educational data are effectively managed and available to end users. The use of Big data differs from traditional data mining, and it requires new approaches, new tools and new skills to deliver the promise of BI and analytics. To optimise the use of big data, institutions will need not only to put the right talent and technology in place but also to structure their workflows and incentives to promote data-informed decisions at all levels.
  • One of the real opportunities for big data in education is to integrate information from multiple data sources. This means working with significantly larger data sets in order to store and mine all the unstructured and structured data to which institutions have access. These will include scientific research, library resources and administrative information, as well as data sets collected via LMS platforms and other sources. Together they can help institutions make smart decisions in areas such as development strategy and organisational management, student recruitment, international markets and intelligent curricula.
  • A shift from data collecting to data connecting. The potential of big data and analytics in education is to connect the unstructured and structured data effectively to identify and leverage the real learning patterns that lead to student success. Mining unstructured and informal connections and information produced by students in this way, including blogs, social media networks, machine sensors and location-based data, will allow educators to uncover facts and patterns they weren’t able to recognise in the past.
  • A new way to manage and use much larger sets of real-time student data. Real-time, contextual data could be used to provide intelligence about learners and their collective or connected learning environments, and to contribute to open-ended and student-directed learning. For example, mobile analytics can take advantage of contextual data for purposes such as tracking learner attention, behaviour management, truancy monitoring, teacher performance evaluation and school dashboards.
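
The shift from collecting to connecting described above can be illustrated with a minimal sketch. It assumes two hypothetical sources, a structured student record table and free-text forum posts from an LMS, both keyed by a made-up `student_id`. The field names, sample values and the simple post and word counts are illustrative assumptions, not a method taken from the reports cited here.

```python
from collections import defaultdict

# Hypothetical structured data: one record per student (e.g. from a student record system).
student_records = [
    {"student_id": "s1", "course": "BIO101", "grade": 72},
    {"student_id": "s2", "course": "BIO101", "grade": 48},
]

# Hypothetical unstructured data: free-text forum posts collected via an LMS.
forum_posts = [
    {"student_id": "s1", "text": "Here is my summary of the week 3 reading."},
    {"student_id": "s1", "text": "Can anyone explain the lab results?"},
    {"student_id": "s2", "text": "When is the deadline?"},
]


def connect_sources(records, posts):
    """Join structured records with per-student activity derived from free text."""
    activity = defaultdict(lambda: {"post_count": 0, "word_count": 0})
    for post in posts:
        stats = activity[post["student_id"]]
        stats["post_count"] += 1
        stats["word_count"] += len(post["text"].split())

    connected = []
    for record in records:
        # Students with no posts still appear, with zero activity.
        stats = activity.get(record["student_id"], {"post_count": 0, "word_count": 0})
        connected.append({**record, **stats})
    return connected


connected = connect_sources(student_records, forum_posts)
```

Each entry in `connected` then carries both the grade and the activity counts for one student, so the two sources can be examined together rather than in isolation. A real deployment would of course need far richer measures and proper attention to consent and data protection.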

Big data related technologies and applications:

  • Cloud computing
  • Linked data
  • Metadata
  • Mashup
  • Stream processing
  • Visualization
  • Google’s MapReduce and Google File System
  • MapReduce & Hadoop
  • IBM InfoSphere BigInsights

Further reading:

Big data: The next frontier for innovation, competition, and productivity. http://www.mckinsey.com/mgi/publications/big_data/pdfs/MGI_big_data_full_report.pdf

“Big data” prep: 5 things IT should do now. http://www.computerworld.com/s/article/9221055/_Big_data_prep_5_things_IT_should_do_now

Big Data and Education. http://blog.xplana.com/2011/08/big-data-and-education/

Hype Cycle for Emerging Technologies, 2011. http://www.gartner.com/DisplayDocument?ref=seo&id=1754719

Penetrating the Fog: Analytics in Learning and Education. http://www.educause.edu/EDUCAUSE+Review/EDUCAUSEReviewMagazineVolume46/PenetratingtheFogAnalyticsinLe/235017

The cloud is for the boring

Members of the Strategic Technologies Group of the JISC’s FSD programme met at King’s Anatomy Theatre to, ahem, dissect the options for shared services and the cloud in HE.

The STG’s programme included updates on projects of the members as well as previews of the synthesis of the Flexible Service Delivery programme of which the STG is a part, and a preview of the University Modernisation Fund programme that will start later in the year.

The main event, though, was a series of parallel discussions on business problems where shared services or cloud solutions could make a difference. The one I was at considered a case from the CUMULUS project: how to extend rather than replace a Student Record System in a modular way.

View from the King's anatomy theatre up to the clouds

In the event, a lot of the discussion revolved around what services could profitably be shared in some fashion. When the group looked at what is already being run on shared infrastructure and what has proven very difficult, the pattern was actually very simple: the more predictable, uniform, mature, well understood and inessential to the central business of research and education a service is, the better it suits sharing. The more variable, historically grown, institution-specific and bound up with the real or perceived mission of the institution or parts thereof, the worse.

Going round the table to sort the soporific cloudy sheep from the exciting, disputed, in-house goats, we came up with the following lists:

Cloud:

  • Email
  • Travel expenses
  • HR
  • Finance
  • Student network services
  • Telephone services
  • File storage
  • Infrastructure as a Service

In house:

  • Course and curriculum management (including modules etc)
  • Admissions process
  • Research processes

This ought not to be a surprise, of course: the point of shared services, whether in the cloud or anywhere else, is economies of scale. That means that the service needs to be the same everywhere, change little or not at all, give its users no competitive advantage, and have well understood and predictable interfaces.