Learning Path to AI from a DBA - Need Help

srini · March 6, 2019, 5:04pm

Could someone guide me in progressing towards an AI/ML career? I currently work as an Application DBA.

I have planned the learning path below:

Hadoop (General Concepts) -> Cassandra -> Spark & PySpark -> Machine Learning (including packages like TensorFlow)

Why Hadoop, you may ask? I want to learn how to pipe data out of big data systems.
Why Cassandra, you may ask? I have to learn it for my work anyway.
Why PySpark, you may ask? I want to learn Python, which is a good all-rounder.

Also, if someone can help me with setting up the environment on a public cloud platform, it would be of great help.

srini · March 7, 2019, 4:04pm

Anyone? Also, could you suggest some resources?

halachal · March 7, 2019, 4:14pm

I’m no expert on the big data stuff, but people have been saying Hadoop is obsolete for several years now…

With your background you are likely aiming at a data engineer role. It’s probably a good idea to browse some job postings and look at the buzzwords they’re asking for.

denzuko · July 24, 2019, 5:52pm

We have that environment on communitygrid.dallasmakerspace.org already.

One can use tensorflow.dallasmakerspace.org for ML work, and Hadoop, blockchain (geth), and FaaS are available via the Docker cluster.

MapReduce (the heart of Hadoop) can be done in Python, so one doesn’t need to learn each of these “technologies” individually; they’re really just suites of the same tool set (i.e. clustering, MapReduce, data science, NoSQL, and SQL).
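For what it's worth, here is a minimal sketch of the MapReduce idea in plain Python, with no Hadoop involved. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in a single process, so treat it as a picture of the pattern rather than how Hadoop actually schedules jobs across a cluster.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big", "data lake"]))
```

On a real cluster the map and reduce steps run in parallel on different machines and the framework handles the shuffle, but the shape of the code you write stays about the same.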

At the end of the day, if one is trying to introduce machine learning into their big data stack, then one is just trying to do predictive modeling on existing datasets. Cassandra would only be there as an intermediary datastore while the actual data lives in a “data lake”.

If I may make a suggestion, take our Data Science course:

Almost forgot: data science is not as sexy as it’s portrayed: