I Hate Regex

Every time I have to use regex, I have to look up the pattern syntax on the Internet, try some variations in my code, and eventually fall back to some regex website to find a pre-made pattern for my specific case. I know I’m not alone in this. I have at least half a dozen regex websites in my bookmarks. New ones pop up every so often, while the old ones vanish into the archives of the Internet.

Here are some wise words from the Internet about this regex situation:

So here we have another contender with a difference: don’t just use; understand.

Yes, iHateRegex.io. So do many others, but this website is working against that hate. While you are here looking for your specific case of regret… I mean regex, the website helps you understand how the pattern matches the given input by showing a visual explanation of the process. Pretty neat, eh!

Image courtesy: iHateRegex.io’s GitHub repo

I found some useful examples there, and I think they are doing a good service to the developer community. Bravo, guys!
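In the same "don't just use; understand" spirit, here is a minimal sketch of annotating a pattern in your own code with Python's `re.VERBOSE` flag. The username rule below is a hypothetical example of mine, not a pattern taken from iHateRegex.io.

```python
import re

# Hypothetical rule for illustration: a username of 3 to 16 characters
# made of lowercase letters, digits, underscores, or hyphens.
USERNAME = re.compile(
    r"""
    ^                   # start of the string
    [a-z0-9_-]{3,16}    # 3 to 16 allowed characters
    $                   # end of the string
    """,
    re.VERBOSE,  # ignore whitespace and allow comments inside the pattern
)


def is_valid_username(text):
    """Return True if text matches the hypothetical username rule."""
    return USERNAME.match(text) is not None
```

With the comments living next to each piece of the pattern, the next person reading the code (probably you) does not have to decode it from scratch.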

Website: https://ihateregex.io/

GitHub Repo: https://github.com/geongeorge/i-hate-regex

Go, give it a try.

Cheers.

A Simple Log Helper for Python

Just wanted to share this simple log helper method I came up with after some research. It helps you quickly set up Python logging and focus on your main app logic instead of wandering around in the logging documentation.

The method is very simple: it takes a filename and two optional arguments, and returns a logger object from the ‘logging’ module, so you can start logging in your Python projects in no time!

Here’s the code:

#file log_helper.py
import os, sys
import logging

def logHelper(fileName, logLevel=logging.INFO, useConsole=True):
    """
    Simple Logging Helper. Returns logger reference.

    Parameters:
    fileName: Filename, may include full path, or will open a file in default folder
    logLevel: Pass logging.INFO, logging.DEBUG or other enums for logging level
    useConsole: If True, will also dump log to console
    """

    ##### init logging
    log = logging.getLogger()
    log.setLevel(logLevel)
    logFormatter = logging.Formatter("%(asctime)s | %(threadName)-12.12s | %(levelname)-5.5s | %(message)s")

    ##### file handler
    fileOut = logging.FileHandler(fileName)
    fileOut.setFormatter(logFormatter)
    log.addHandler(fileOut)

    ##### console handler
    if useConsole:
        consoleOut = logging.StreamHandler(sys.stdout)
        consoleOut.setFormatter(logFormatter)
        log.addHandler(consoleOut)

    ##### return reference
    return log

And usage is simple. Passing just the filename defaults to the ‘logging.INFO’ log level with console output turned on:

# import helper
from log_helper import logHelper

# get logger
log = logHelper('logfile.log')
log.info("Got logger")

Or use optional arguments to set up logging level and console output:

# import modules ('logging' is needed for the level constants)
import logging
from log_helper import logHelper

# get logger
log = logHelper('logfile.log', logging.WARNING, False)
log.warning("this is a warning message")

Additionally, you can always tweak the method to suit your needs, for example with a rotating file handler, additional output handlers, or a different log output format.
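As one possible tweak, here is a rough sketch of the same helper using the standard library's `logging.handlers.RotatingFileHandler`. The function name `rotating_log_helper` and the `max_bytes` and `backup_count` defaults are my own picks for illustration, so adjust them to your case.

```python
import logging
import logging.handlers
import sys


def rotating_log_helper(file_name, log_level=logging.INFO, use_console=True,
                        max_bytes=1_000_000, backup_count=3):
    """
    Variant of the log helper that rotates the log file by size.

    max_bytes and backup_count are illustrative defaults (assumptions):
    the file rolls over at roughly max_bytes, and at most backup_count
    old files (file_name.1, file_name.2, ...) are kept.
    """
    log = logging.getLogger()
    log.setLevel(log_level)
    formatter = logging.Formatter(
        "%(asctime)s | %(threadName)-12.12s | %(levelname)-5.5s | %(message)s")

    # rotating file handler instead of the plain FileHandler
    file_out = logging.handlers.RotatingFileHandler(
        file_name, maxBytes=max_bytes, backupCount=backup_count)
    file_out.setFormatter(formatter)
    log.addHandler(file_out)

    # optional console handler, same as in the original helper
    if use_console:
        console_out = logging.StreamHandler(sys.stdout)
        console_out.setFormatter(formatter)
        log.addHandler(console_out)

    return log
```

Like the original, this attaches handlers to the root logger, so call it once at startup; calling it repeatedly would add duplicate handlers and repeat every log line.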

Check out Python’s documentation for ‘logging’ module here for more info.

Bug reports? Suggestions? Let me know if you found this useful. Leave a comment 🙂

Cheers,

// Sohail

Hi, how R you? I’m Python, Thanks. – R vs Python for Data Science

The debate is futile because every computer language has a reason to exist; otherwise *anything* can be achieved in C, on *any* OS… trust me!

We have all seen the debate of dotNet vs JAVA, which lasted until the mobile apps era emerged; now those debaters are busy building apps for mobile platforms, in JAVA and all sorts of “weird old” and “inspired new” languages… right? (Pun intended) 🙂

Then came the Big Data and Data Science boom, and with it the debate over choosing between the two main open source languages, R and Python, among practitioners of Data Analysis and Data Science. I have been watching this argument for quite some time now, and in my personal opinion, it’s just a matter of choice, which involves ease of use, target platform, tooling, syntactical benefits, and largely the function library (call it packages, modules, etc.) that the language and its community have to offer.

I personally like both. I use R with RStudio for exploratory data analysis and data visualization tasks, and feel comfortable using Python with Visual Studio for tasks that require deeper programming logic and automation, for example automation of OS-specific jobs, machine learning tasks, web scraping and use of APIs.

Just before I posted this blog, I came to know that the KDnuggets Annual Poll for Analytics, Data Mining software used (May 2015) has posted its results, where R is the top choice for data miners and Python is gaining significant momentum, with a 30% share in top tools. Check the link above for details.

Top 10 Analytics Data Mining Software 2015
Image Source: Kdnuggets.com May 2015 Poll

The guys at datacamp.com recently did this comprehensive infographic (see below), which includes a ton of research on the subject (Original article: Choosing R or Python for data analysis? An infographic). This work would have concluded the argument already, but then Martijn Theuwissen and Anna Anisin from dominodatalab.com came in with the post “Comparing Python and R for Data Science” to put the final nail in the coffin of this argument.

Enjoy the infographics:

R vs Python Infographics
Image Source: datacamp.com

 

Source Credits:

Cheers,

// Sohail

Free VMware vSphere Tools

While looking for a backup solution for my VMware environment, I stumbled upon a very nice article on Kendrick Coleman’s blog: a top 10 list of carefully selected tools for managing your VMware vSphere deployments. And the best thing is that they are all free!!

Kendrick has also done video demos for these tools, so you can pick the best one without installing them.

Here is the link: Top 10 Free VMware vSphere Tools and Utilities for 2011 

Nicely done Kenny!

Cheers.

/via kendrickcoleman.com

Data Visualization with Microsoft Live Labs Pivot

UPDATE: According to Microsoft, the product has graduated to become a Silverlight web control and is available as Silverlight PivotViewer. The website that initially hosted Pivot at getpivot.com is no longer active. Links below are updated accordingly.

Yet another beautiful piece of software from MS Live Labs: Pivot, a fun way of exploring and visualizing data. Here