Is education a data-intensive science? Should it be?

Can collecting and sorting through massive amounts of learner data improve education? It's par for the course in some other fields. If you still think of astronomers scanning the skies, peering through enormous telescopes on mountaintop observatories, or studying digital pictures coming back from the Hubble telescope, think again.

"People now do not actually look through telescopes. Instead, they are "looking" through large-scale, complex instruments which relay data to datacenters, and only then do they look at the information on their computers." Tony Hey et al, The Fourth Paradigm

Astronomy is a data-intensive science. Defined loosely, that means that there's just way too much data even for scientists to manage. Used to be, scientists had to carefully craft an experiment in order to generate data. Their job was to create precise, specific data that would then be carefully added to someone else's precise, specific data, and that would go on in an additive fashion for a while, and eventually, hey! A conclusion could be drawn.

Not anymore. In many fields today, scientists are simply awash with the stuff. It's a tsunami of data, it's... more data than any one metaphor can hold. It's downloaded from instruments, from automated inputs, millions of nodes, devices, all connected to computer databases, or even generated by computer networks themselves.

Once it's set up and put in motion, all this information is collected automatically, sometimes at terabytes per second. CERN's supercollider generates over 1,000 terabytes of data per second, a petabyte of data. To give you an idea how much that is, it takes more than 400 high-definition movies to add up to 1 terabyte, and we're talking a thousand times that much... Imagine 400,000 HD movies generated out of thin air every second. Store that on your DVR.
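The back-of-the-envelope math above can be written out as a quick sanity check. This is only a sketch in Python: the two input figures are the ones quoted in the paragraph, the constant and function names are made up here for illustration, and the per-day figure is simply the same multiplication carried one step further.

```python
# Figures quoted in the paragraph above; the names are illustrative only.
TERABYTES_PER_SECOND = 1_000   # "over 1,000 terabytes of data per second"
MOVIES_PER_TERABYTE = 400      # "more than 400 high-definition movies" per terabyte
SECONDS_PER_DAY = 24 * 60 * 60


def movies_per_second(tb_per_second=TERABYTES_PER_SECOND,
                      movies_per_tb=MOVIES_PER_TERABYTE):
    """HD-movie equivalents of data produced each second."""
    return tb_per_second * movies_per_tb


def movies_per_day(tb_per_second=TERABYTES_PER_SECOND,
                   movies_per_tb=MOVIES_PER_TERABYTE):
    """The same rate carried over a full day (an extrapolation, not a quoted figure)."""
    return movies_per_second(tb_per_second, movies_per_tb) * SECONDS_PER_DAY


print(movies_per_second())  # 400000
print(movies_per_day())     # 34560000000
```

Since both inputs are stated as lower bounds ("over" and "more than"), the results are best read as rough floors rather than exact counts.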

What do they do with all the data? Well, they dump it. Boatloads of perfectly good data are erased every day, by really good scientists, too, because they just can't keep it. They can only keep the final tallies, the results, the reports. There's a whole science now developing across these fields that is just about how to manage data, called "e-science." Microsoft gives an award for advancing it. Alexander Szalay of Johns Hopkins won it last year.

To try to get a feel for the magnitude of this, imagine so many Major League Baseball teams, and so many games being played every day, that you couldn't keep all the recordings of them anywhere. The evening sportscast would go like this: "And in baseball, there were ten trillion National League games played today, and here are the scores: the winners trended toward the home teams once again, by a 54 percent margin. Congratulations, home teams!" Want to watch some highlights? Sorry, they're gone forever... the recording was erased the instant the last out was made.

Is education a data-intensive science? No. But it could be. The amount of data generated by educational activities every day is stunning, but only a small percentage of it is captured. How much communications data is generated every day? Just think about the phone minutes alone, being tracked and logged and billed globally. And education as an industry is about three times the size of both the entertainment and communications industries. It's bigger than both combined. Those industries have gotten very good at collecting data, all kinds of it, and using it. They use it for billing, marketing, product improvement, service improvement, competitive analysis, pricing, investment...

Education and training? We're not so good at collecting data. But we are getting better. The data being generated by learning is more and more being collected digitally and used for similar purposes--which may be good or bad, but is likely inevitable. But it's also being mined for loftier reasons. Arizona State University now uses data mining techniques to create student profiles that help guide students through their college careers. Check out this article from the Chronicle of Higher Education.

But ASU is just scratching the surface. A larger and larger percentage of the learning being done in schools, universities, and the workplace is being done using digital tools... online courses, digital learning objects, digital textbooks, simulations. Final grades are a tiny percentage of the data that is already being collected. The rest is learner data that can be, and is being, collected during the trial and error of the actual learning experience: the clicks and drags and searches and reviews and posts and responses and ratings and submissions.

The global push toward measuring learning outcomes, toward judging the quality of education by what students can or can't do, what they do and do not achieve... that cannot now be disconnected, never again will be disconnected, from tracking and progress data. The era of a final, a mid-term, six quizzes and a paper adding up to the sum total of a student's data? That's over. Collecting and using data to measure actual learning success, or lack of it, is exactly what makes high-profile efforts like Khan Academy work.

Should education be a data-intensive science? Moot question. It almost certainly will be. Better question: how do we make it as good as it can possibly be?

How eLearning became Educational Publishing. And vice versa.

I recently had a conversation with the highly accomplished head of a highly respected global education organization that you would recognize if I named it. I'm not going to name it, though, because I'm going to tell this story. And even though it's not a negative story, it does illustrate a gap that has yet to close, and I don't want to single out any entity or person. We were discussing that organization's online learning division, and I mentioned that it might make sense if he merged it with their publishing division. He looked genuinely bewildered, and I got the feeling he rarely looked, or felt, genuinely bewildered. "Why would we do that?"

Now it was my turn to look bewildered. I really didn't know where to begin, because it seemed so completely logical to me. I knew I needed a thirty-second elevator speech as a response, and I also knew that in order to provide one, I would have to condense about twenty years of my own hard-won experience into the answer.

What I managed was something like "Online courses are digital products. So are books and other materials, even if they are printed before they're sold. At the root, they're really the same thing... digital products that have to be designed and developed using pretty much the same set of capabilities and processes. Uniting them in the same organization makes both more efficient." I'm very sure my answer didn't make quite so much sense as what I just wrote out from memory, but that's the nice thing about blogging... you get to be the reporter, the subject, the editor, and publisher at the same time.

What I couldn't do then is what I want to do now... lay out a considered case for the idea that digital publishing and the development of online learning courses are essentially the same thing and ought to be considered together from now on going forward forever. Here's another way to state my thesis...

Twenty years ago, when I first ventured into "distance learning," I wasn't in the publishing business. Now, I can look back and say with complete accuracy that I've been in educational publishing for twenty years. That's because the definition of publishing has changed, and will continue to change, morphing these two pursuits into one.

Just for fun, let's take a look at the dictionary definitions of publishing, as they have evolved.

Merriam-Webster, 1976: "The business or profession of the commercial production and issuance of literature, esp. in book form for public distribution or sale."

2012: "The business or profession of the commercial production and issuance of literature, information, musical scores or sometimes recordings."

2012: "To issue (printed or otherwise reproduced textual or graphic material, computer software, etc.) for sale or distribution to the public."

Wikipedia, 2012: "The process of production and dissemination of literature or the activity of making information available to the general public."

Notice that even though the last three definitions are from the current year, there is an obvious progression of sources, from a staid, pre-web company now online, to one of the original dot-coms, to a true Web 2.0 entity (Wikipedia). And the progression is unmistakable... each definition is broader than the previous, until finally Wikipedia just says that publishing is taking "information" (as broad a noun as you could choose) and "making [it] available" (as broad a verb as you could choose) to the "general public" (as broad an object as you could find). But that's where we are.

But to make my case, I won't rest on definitions. I want to take a peek into what's been happening with textbook publishers, and compare it to what's been happening with the developers of online courses. Let's pick higher education for our example, but you could pick K-12, corporate training, continuing professional education, anything you like and the same point could be made.

What's been happening with textbook publishers is that they've been moving online. Earlier in this millennium, every textbook had to have a CD-ROM to go with it, so that students could plug something into their computers and interact with the content. But those quickly gave way to the now-ubiquitous user codes that allow textbook purchasers to simply log on to the textbook website, the one designed specifically for this particular edition of this textbook, and get... an online learning experience. Here's the list that is actually published in the front matter of a well-known college calculus textbook... the things that come along with the price of this book:
  • Online homework practice
  • Testing
  • Tutoring
  • Graded homework
  • Classroom management
  • Online course
  • Interactive resources
Pretty much everything that defines online learning, including... the online course!

This is not unusual. Publishers are under pressure to provide an online course to go along with their textbook, and most develop one. Sometimes these come in the form of a "course cartridge" that professors can plug into Blackboard or Moodle or whatever LMS they have, but often it's just... the course that goes along with the textbook. It's there online, in the publisher's own learning management system. If your professor wants to teach the whole course online, there's nothing stopping her.

Pearson is the biggest, most successful textbook publisher on earth, and a few years back they bought one of the leading LMS companies, eCollege. They have known for years about this merger of publishing and eLearning, and that the definition of publishing now looks more like Wikipedia's than like an older dictionary's.

Now let's look at it from the eLearning side. In higher ed, online learning started out in the late '90s with a single professor signing into a Blackboard account, learning how to upload documents and write out class lectures so he could teach his own students. This was the Web 1.0 variation of what faculty always have done, just online instead of in a classroom. Now, the most successful online programs are universally acknowledged to come from the for-profit universities: Phoenix, Walden, Capella, Kaplan. It's true, they are the most successful. They are also the best. They have the best online programs because they figured out early that courses are products. When they go about creating a course, they invest in all the same things that publishers do. They hire designers, writers, editors. They have people in the traditional publishing roles, even if they don't call them that, and probably didn't hire them out of a publishing background. Here are the publishing roles, and every successful eLearning entity fills them:

Acquisition -- deciding what to publish, and who the subject matter experts are
Development -- the equivalent of writing the course
Editing -- making the rough draft into a polished product
Design -- deciding on, and sticking to, a certain look-and-feel, and a user interface, and a standard progression through the content
Production -- pulling it all together, with video and interactivities, assessments, and quality control.

The most successful textbook publishers are eLearning producers. The most successful eLearning producers follow publishing processes. The gap between the two is vanishing. And that's why it made such obvious sense to me that the publishing division and the eLearning division ought to be connected.

I have one more proof for my thesis, which I'm not going to put directly into this blog post, but I'll link to it. I went back through my own twenty-year career and laid it out as if I had always been in the field of Digital Educational Publishing. Very eye-opening. Take a look here, and let me know if you disagree.

Flipping the classroom. It's time.

For years I've been talking about how online learning creates a "reversal of fortune," because in a classroom the student is entirely on the teacher's turf, but as soon as you put learning online it's the opposite. It's the teacher and the learning that have to adapt to the student's personal environment. This reversal has enormous ramifications, the top of the heap being that online learning must now be considered a product in ways the classroom is not. You can't assume learners will follow your rules just because you say so.

The reversal of fortune has now evolved, and its offspring is the flipped classroom. First let me define the term, so there's no confusion: "Flipping the classroom" is, at its most basic, asking students to learn the content at home and practice it in the classroom, instead of the other way around. When the Internet is available everywhere, there's no earthly reason that valuable class time should be wasted on lectures. Those can be recorded and watched at home. And then that valuable class time can be used to mentor, coach, facilitate, interact, answer questions, test, and in every possible way to make sure each student has learned the material and can put it to appropriate use.

Pearson has been flipping the classroom with its MyMathLab product for years, though sometimes without the knowledge of the faculty. I've blogged about this in past posts. The MyMathLab product presents itself as homework, but in fact it teaches. Teachers can choose either to flip the classroom by assigning the homework first, then talking it through and practicing it in the next class session, or they can just pretend that their lectures have suddenly gotten much better (I'm sure this is rare) and reap the rewards of an almost-foolproof homework assignment.

So how does this relate to the online learning "reversal of fortune"? The flipped classroom grew out of it. It's a part of it. Without quality online content of some sort, students are not going to be able to learn the content by themselves at home. After all, teachers have been assigning a textbook chapter to be read ahead of class for years, and it hasn't flipped anything. Textbooks are just not effective enough to actually teach. In order for the classroom to be effectively flipped, you need to have a highly effective learning product online. And that's where it becomes necessary to "productize" learning, to think through all the product details (look, feel, flow, interfaces--and the simple focus on the right content at the right time in the right manner to the right audience).

Let's take a case in point that has recently hit the national consciousness through "60 Minutes": Khan Academy. Parents and their kids have known about this homework secret weapon for years. Khan Academy is not where you go when you want to practice what you've learned in class; it's where you go when you don't get it. It's where you go when what was taught in class just didn't make any sense, and Mom and Dad can't help.

Khan Academy is, or at least it was, little more than a collection of mini-lectures that teach math and science concepts through the simple device of a really gifted teacher, Sal Khan, talking aloud while writing and illustrating in multiple colors on a black screen. Nothing could be simpler. Or could it? The fact is it's not nearly as easy as it looks... it has taken off not because of the format, but because Sal Khan has this amazing capacity for explaining the complicated in very simple ways.

As "60 Minutes" pointed out, some teachers resent him because they think of themselves as lecturers, and he's better at it. Others embrace him because they know they are really educators, and he helps their students actually learn. And to those teachers who do get it, Sal Khan is not replacing them; he's replacing the textbook. He is the textbook. He is the chapter of the textbook that can be assigned ahead of class time, so that students arrive already familiar with the concepts. Except unlike a traditional textbook, with Sal Khan they actually will learn ahead of time.

Sal knows, as do all those teachers who use his stuff, that a series of video lectures does not an education make. In fact, Sal knows this so well that he has (with a little help from his new friend Bill Gates) built a Learning Management System for teachers to use, to help them facilitate learning in their classrooms. The result is something truly innovative. I'm going to say that it might even be the first truly new technology that has been designed exclusively for higher levels of learning (say, above 6th grade); the first one that doesn't just borrow technologies created for entertainment or communications. It has an enormous capacity to change the way education works, because it is designed for no other purpose than to make sure education works. What a concept.

But Khan Academy is not the only entity contributing to the classroom flip. There is also technology that all teachers can use to flip their own classrooms, right now, today. You don't need Sal for this... you can put your own lectures or lessons online and start assigning them as pre-work for the study sessions you have in the face-to-face classroom.

What does all this mean? It means that online learning, eLearning, distance learning, is having its inevitable impact. The inevitable is this: education will progress down the path of technology just as entertainment has, just as communications have. More and more of it will be mediated through technology. Why? Because people will keep finding ways in which using technology actually improves learning.

Flipping the classroom is only the beginning. We are still scratching the surface. But that scratch is starting to satisfy a much bigger itch.