## Archive for the ‘Data Science’ Category

## My Email to my Appointed Mentor from UNISA

I was excited today to receive an email from Dr. Abraham Tlhalefang Motlhabane of UNISA, who has been appointed as my doctoral supervisor. Dr. Motlhabane has an excellent background in science education, and I hope he will also have sufficient background in statistical methods to help me get beyond my current limitations, or that other UNISA professors can help as well. After he emailed me, I wrote the following email, which I think is a good self-reflection of where I am on this project.

## Udacity is Guaranteeing Graduates will Get Jobs, is there a catch?

I wrote recently about my thoughts on whether MOOCs have been a failure. Udacity is showing that they are not, and it is an example of how the potential of technology to “disrupt” a market is finally reaching the realm of education. It has now put its “money where its mouth is” by doing something no college (that I know of) has done: guaranteeing its graduates a job. But what is the catch (if there is one)?

## Are We All Schrodinger’s Cat?

Over the past few years, I’ve often replied to questions about how I was doing by saying I’ve felt a bit like Schrodinger’s Cat: I feel I don’t know “how I am”, because what that might mean in the near future is so unknown, and could differ dramatically depending upon the swirl of events surrounding my life.

But I have come to realize that, throughout our lives, each of us is somewhat like Schrodinger’s Cat, living in bubbles of uncertainty, with unknown butterfly effects surrounding us. The biggest difference between us and the fabled cat of unknown fate is that we have the opportunity to make choices that feed into the intricate system of the universe. Still, it seems the best we can do is to model choice as randomness, or at least probabilistic randomness.

## Three Missing Features in Most Student Information Systems (SIS)

There is a paradox: humanity’s most developed organizations and systems are based upon what is learned in our education systems, yet the field of education lags behind nearly all others. One area where I have seen this is in how feature-poor Student Information Systems (SIS) are. Despite such systems being case studies in many database books, most of them do not use any data science methods to improve operations. Specifically, I have rarely seen active security, predictive analytics, or even resource optimization as features. Here is why these are important to have, and my invitation for SIS providers to come into the 21st century.

## Computer Science Math Course & Study Group On Hold Indefinitely

Well, after putting over 50 hours of work into the creation of a high school math course focused on computer and data science (over Thanksgiving break), yesterday threw me a curve ball at work, and it looks like this work won’t be relevant at Highlands Community Charter School for the time being. This doesn’t mean that it won’t come to fruition in the future, as I feel the foundation I was building and the Open Educational Resources (OERs) I was compiling have real value and are of excellent quality. But it does mean that the study group and class are on hold…

## An Update about the evolution of the

I wanted to give everyone an update about what originally was going to just be a study group about Information Theory, but has expanded into being a full high school course on the Logic, Algebra, and Statistics of Computer and Data Science.

## Introduction to Boolean Algebra & Information Theory: A Twelve Week Course

As I’ve been posting about, I’ve been working to learn more about information theory, to bridge my way to learning Bayesian analysis techniques along with other machine learning techniques. While at first I was only considering having a self-study group, I have come to believe that the first portion of this learning could be at a high school level (albeit advanced). So I am now planning on offering these topics as a class, where we will meet once a week, in the evening, probably on Thursdays at around 5:30. I have created a Moodle class that will have all the topics (you can log in as a guest to get in for right now). This class will be free, although if you are not an HCCS student (and if you have a high school diploma already, you really can’t be an HCCS student), then there won’t be any credit you earn. Although, if we do what I plan, and continue on to more rigorous work, we would have the potential for credit through LearningCounts.

## Proposed “Syllabus” for Module 1 (High School Rigor) of the Information Theory Self-Study Group

As I posted recently, a self-study group is starting to learn Information Theory. It will ultimately parallel Oberlin’s Math 345 course, and thus be something we could earn upper division college credit for through LearningCounts. But I believe learning should start with the easy and conceptual and then move to the more challenging and technical. Thus, the first third of the “course”, which I’m calling “Module 1”, is going to be at what the author of our main text deems a high school level of math rigor. We will mostly watch videos that are also geared towards high school students and/or the general public.

So I’m proposing the following sequence for this study group, with approximate dates for when we would complete different portions of our reading and activities.

## Calling all Fellow Math Nerds: Let’s Learn Information Theory Together

I believe I have finally done enough research on statistical forms of regression to see that my doctoral research will require Bayesian methods and/or knowledge from Information Theory. So I am starting to dive into these, beginning with Information Theory, and I want to see if any of my friends are interested in forming a study group to learn this together. I believe if we do it with sufficient rigor, we can earn upper division math credit through LearningCounts, so this could be of value both personally and professionally.

## Non-Linear “Regression” using a Pattern Recognition Algorithm combined with Monte Carlo Simulations

For some time, I have been trying to find a single standardized data mining method for testing any set of data against various best-fit curves with various underlying error distributions. (I have realized that my idea of removing “outliers” isn’t really solving the problem, because they are often there for a “reason”.) Yesterday, I had a “crazy” idea that might just be able to solve this…