Category Archives: programming

Let’s dig out a Localtunnel!!!

So as I am creating a very early version of the webapp, I want to get opinions from friends. A few things got in the way: I’m not quite ready to commit to a full-blown installation, and I am too lazy to set up an SSH tunnel, which I have never attempted before.

Of course, there is option number 3: Localtunnel. Basically, localtunnel is a Ruby script that makes tunneling very easy, without you needing a remote SSH box. They provide a remote Linux box, which has no shell and only an authorized_keys file in the home directory, so you don’t need a dedicated SSH box somewhere.

On Linux, make sure you have Ruby and RubyGems installed. On Ubuntu, use the command below; on Fedora, just replace apt-get with yum. The rest of the instructions are at the link below.

sudo apt-get install ruby1.8 rubygems

Python Web Scraping

There are times when a government website has very useful information, but unfortunately the data is only available as web pages. It could be worse: it could be in PDF. So it can be a pain if we want to use the information in a program, but there is no API.

On the other hand, Python is a pretty powerful language. It comes with many libraries, including ones that can be used to make HTTP requests. Introducing urllib2, which is part of the standard library. Downloading data from a website with it can be done in 3 lines of code:

import urllib2
url = "http://www.example.com/"  # placeholder; use the page you want to scrape
page = urllib2.urlopen(url)

The problem then is that you get a whole pile of HTML, which is a bit hard to process. Python has a few third-party libraries for this; the one I use is Beautiful Soup. Beautiful Soup is nice in that it is very forgiving when processing bad HTML markup, so you don’t need to worry about bad formatting and can focus on getting things done. The library can also parse XML, among other things.

To use Beautiful Soup:

from BeautifulSoup import BeautifulSoup
page = "<div>html goes here</div>"
soup = BeautifulSoup(page)
value = soup.findAll('div')  # list of every <div> in the page
print value[0].text

But you need to get the HTML first, don’t you?

import urllib2
from BeautifulSoup import BeautifulSoup
url = "http://www.example.com/"  # placeholder; use the page you want to scrape
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
value = soup.findAll('div')
print value[0].text

To use it, just download the data using urllib2 and pass it to Beautiful Soup. It is pretty easy to use, to me anyway. Note, though, that urllib2 is reorganized in Python 3, so the code will need some modification there.
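For what it’s worth, here is a minimal sketch of what the download step might look like on Python 3, where urlopen lives in urllib.request. The function name fetch_html and the encoding default are my own assumptions for illustration, not part of the original scraper.

```python
import urllib.request


def fetch_html(url, encoding="utf-8"):
    """Download a page and return its body as text.

    fetch_html is a hypothetical helper name; the utf-8 default is an
    assumption and may need changing for a particular site.
    """
    # In Python 3, urllib2.urlopen became urllib.request.urlopen.
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes, not str, in Python 3
    # Replace undecodable bytes instead of failing on messy pages.
    return raw.decode(encoding, errors="replace")
```

The returned string can then be handed to Beautiful Soup as before. On Python 3 the library is imported with from bs4 import BeautifulSoup, so that line needs changing too.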

To see how the scraper fares, here is a real-world example on GitHub, part of a bigger project. But hey, it is open source. Just fork it and use it, at this link.

So enjoy, go forth and extract some data, and promise to be nice: don’t hammer their server.
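One simple way to be nice is to pause between requests. This is only a sketch: fetch_politely, its fetch argument, and the one-second default are all assumptions of mine, and the right delay depends on the site.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, sleeping between requests.

    fetch is any function that takes a URL and returns the page;
    both names are hypothetical and only used for illustration.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            time.sleep(delay_seconds)
        pages.append(fetch(url))
    return pages
```

Passing the download function in as an argument keeps the delay logic separate from how the page is actually fetched.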

Diff’ing and Patching

So I have a few folders with different versions of source code, erm, text files, which I am comparing at work. Either way, thanks to Unix tools like diff and patch, things are somewhat easier. Though it would be better to use version control, which I was stupid enough not to do.

To compare files with diff, it is just a matter of

diff original new

Then, my team is creating the files on Windows, which uses \r\n line endings, and I am on Linux, which uses \n. So to ignore this in the diff:

diff -w original new

This should ignore differences in whitespace characters such as \r\n. That may not be a wise choice for Python, because of its indentation sensitivity, but I trust my team is sane enough not to mix tabs and spaces (yes, non-Python programmers, you can all laugh).

Since I have a directory, I would run
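If the indentation worry matters, one alternative is to do the comparison in Python and ignore only the line endings. This is a rough sketch using the standard difflib module rather than diff itself; the function name and the default labels are my own.

```python
import difflib


def diff_ignoring_line_endings(old_text, new_text,
                               old_name="original", new_name="new"):
    """Return unified diff lines, treating \\r\\n and \\n as the same.

    Hypothetical helper: leading whitespace is kept, so a change in
    indentation still shows up as a difference.
    """
    # str.splitlines() drops "\n" and "\r\n" alike but leaves the
    # rest of each line, including indentation, untouched.
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name,
                                     lineterm=""))
```

An empty list means the two texts differ only in their line endings, if at all.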

diff -rw original-folder new-folder

Pipe it to less just to make it readable.

If I’m lucky and it is just a matter of adding stuff to the text file, I generate a patch for that file with
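When all I want is the list of files that changed, not the full diff, something like the following sketch with the standard filecmp module could do. The function name is made up, and note that filecmp compares file metadata first (a shallow comparison) before it looks at content, so this is an approximation of what diff -r reports.

```python
import filecmp
import os


def changed_entries(left, right):
    """Return sorted relative paths that differ or exist on one side only.

    Hypothetical helper built on filecmp.dircmp, which by default does
    a shallow comparison based on file type, size and modification time.
    """
    result = []

    def walk(comparison, prefix):
        # Files present in both folders but different, plus entries
        # found in only one of the two folders.
        names = (comparison.diff_files
                 + comparison.left_only
                 + comparison.right_only)
        for name in names:
            result.append(os.path.join(prefix, name))
        # Recurse into subfolders that exist on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right), "")
    return sorted(result)
```

Calling changed_entries("original-folder", "new-folder") would then give the paths worth opening in a proper diff tool.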

diff -u original new > original.patch

and apply it with

patch original < original.patch 

in the folder. I should really use the -p1 option in the patch command, but a few folders have the same name, so I just put the patch in that folder and run it without -p1.

Of course, sometimes things are not that easy, so it is nice to see all the differences with highlighting. For that I use vimdiff:

vimdiff original new

Then I realized that doing all the diffing and patching can be a pain in the back. So it is better to keep the code under version control, whatever the reason and however rushed it is.

Working with a dinosaur, aka COBOL programming

Thus one has begun programming in COBOL. Programming in COBOL on a mainframe has enlightened me in many ways.

Among other things, many things that I took for granted in programming, I now miss dearly:

  • An interactive terminal is very useful, because on z/OS everything is done in batches. So I have to submit the job, wait for the result, then go to the SDSF utility to see the printout, luckily on screen. On Unix (in my case Linux), the result is shown immediately after compilation.
  • And because of that, a JCL file tends to do more than one job. In my case, there is this compile-and-link JCL file that, well, compiles and links.
  • On z/OS, the utility programs look as if they are obfuscated. The compilation program is named with a long acronym; I don’t even remember the name. But there is always a way to define a user-friendly name.
  • There is no easy way to pass parameters. On Unix it is done easily, but not on the mainframe.
  • Worse, there is no good development environment. On Unix there are emacs and vi. Here I am considering only console environments for programming. It is a pain.
  • You realize that the platform is dated. Even JCL keeps the same model as where it came from, the punch card, which means it is only 80 characters wide and lines start with //.
  • In COBOL, you realize the program is long, and you define the variables at the beginning. So one has to think of the variables first.
  • And it is too verbose.
  • There is no easy way, that I have found so far, to print a multi-line string. Python has one, and hell, even C has “\n”.
  • Like the mainframe, the design looks very dated. I mean, who uses MOVE to assign a variable anymore? This is not assembly………………..

Programming on the mainframe provides an interesting challenge. But it is also a pain, in many, many ways, which makes it the last platform of choice for software development. One might not use it unless forced to. Which is my case, actually, because I cannot breach a contract (it seemed fun at the time; I am regretting it now). The other case is really critical legacy apps, which a bank has a lot of and is forced to keep.

I actually understand now why Dijkstra is right, and why The Tao of Programming is right. I am beginning to understand UNIX a bit more. Most of the languages that rank low on my personal list, like Perl, Java and Ruby, get high points now, except BASIC, which I still think is too restricted. Because of the restrictions of COBOL, I also like Lisp much more………

A strange experience, but it is true…………….

Fun with python: web development with Django: part 1 setup

After setting up my database and installing the IDE (see the Open Komodo post), the next step is to start a project. I decided to play with Python, and Django is on my list because 1) it’s Python, 2) it works on Gutsy, whereas TurboGears doesn’t, and 3) I kinda like Django’s own ORM.

First things first: installation is dead simple.

1) Get Django. I got the tar file; it just happens that is what I use.

2) Untar it somewhere.

3) Run this command in the directory you untarred it into:

python setup.py install

So nothing special.

One note here: you will need an extra web server. It turns out that Django assumes there is a separate server, other than its own, serving media like images and files. And it is a pain to create links and so on. It’s actually easier to just enable UserDir in Apache and put the pictures in the public_html directory in your home.

I need UserDir because it needs a directory that can be written to, and an Ubuntu user account cannot directly write to the Apache root, which is in /var/www.

Now you have a web framework that has a few interesting features, and an internal server for development purposes.

Fun with python: running programs using popen

Yesterday, I ran a Unix command using the commands module.

It turns out that there is another way, and more importantly a cross-platform one, so it will run on Windows.

Python provides a set of modules for generic operating system services. One of them is the os module, and in it is a popen() function, which can be used to call a program.

So, to run a program using popen, using “ls ~” as an example (it can be any program):

import os
comm = os.popen("ls ~")
# comm behaves like a file holding the program's output
for line in comm:
    print line

which will print the output. But for some reason, running cat on a file doesn’t work. You don’t really need to print the output; you can just use it to run a program.
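On Python 3 the same idea can also be sketched with the subprocess module rather than os.popen. The helper name run_and_capture is my own invention, and this is just one reasonable way to do it, assuming Python 3.7 or newer for the capture_output argument.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a program given as an argument list and return its output.

    Hypothetical helper: args is a list such as ["ls", "-l"], so no
    shell is involved and "~" would not be expanded.
    """
    # check=True raises CalledProcessError if the program fails.
    completed = subprocess.run(args, capture_output=True,
                               text=True, check=True)
    return completed.stdout
```

For example, run_and_capture([sys.executable, "-c", "print('hello')"]) runs the current Python interpreter as the child program, which works the same on Linux and Windows.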

Open Komodo: an open source code editor

Open Komodo is an initiative by ActiveState to open source some of their software. Open Komodo is essentially Komodo Edit, a shrunk-down version of their IDE. So what happens here is that they open sourced their Komodo Edit, which is free of charge anyway. It is not much of an IDE, but good enough for most tasks.

The editor supports quite a number of languages, such as Python, Ruby, Java, and a few others. One notable exception is PHP, which is not on the list. I have been testing it with Python, because I have been playing around with Django.

The basics are there, such as organizing code into projects and code completion, which is totally useful. But the code completion feature is not quite there yet. Some code in the directory cannot be picked up through import, so completion for it doesn’t really work.

Compared to an IDE, it is a bit barebones, but then it is a bit like a GUI version of emacs and vi. That is probably the reason why I think it is quite fine.

One thing: it is still alpha, but it is quite usable to me now. Probably there is more to come in the future.

One cool thing is that to install Open Komodo on Linux, any Linux, you just make the install shell script executable with chmod +x and run it in a shell.

It will copy itself to your home directory and create a shortcut on the desktop. To remove it, just remove the directory it created and the shortcut. It’s all in the README file.

You can get Open Komodo here