Thursday 5 May 2011

Top tips for finding a new job

Deborah Meaden's top tips for finding a new job
As savings are made in the public sector, thousands of employees will be looking for new jobs this year. The government hopes the private sector can pick up the slack, but are former public sector workers equipped for change?
Dragons' Den investor and successful entrepreneur Deborah Meaden, one of four business mentors helping public sector workers facing redundancy in Newsnight's Job Market Mentors, gives her top tips on making the transition.

DON'T DELAY

Often when faced with something pretty traumatic like losing your job, people have a tendency to bury their heads in the sand.
Don't. Doing so is bad for you, and it sends a bad message to any potential employers.
If you know you are going to lose your job then be proactive and start to plan before it even happens.
If it has just happened then get moving now.

Your CV needs more than just a brush-up. It needs to be completely re-worked and then carefully tailored for each and every job that you go for.
If you send the same CV out for 10 different jobs, nine times out of 10 you will be sending the wrong CV.
Employers in the private sector often have pre-conceptions about people from the public sector - that they are risk averse, that they want long holidays and short hours - you need to be able to dispel all of those pre-conceptions.
Realise that you are in a new environment now and make sure that your CV works for that new environment.
Don't just dwell on what you have done, but really think "how do I fit that job?" then tinker with your CV to show that you do.
Stop thinking in terms of what you do and fixating on the job that you have had. Start thinking instead about the transferable skills it takes to do that job. I could take anyone who has had a job and apply at least 75% of the skills that they have acquired in it and use them in another role.

Get used to the idea that the first step you take into the private sector may not be perfect.
You may not get your ideal job straight away; it may be the second job or even the third that fits best. But you have to take that first step and get going.
And take that step with enthusiasm - private sector employers are waiting for that hangdog public sector attitude that some think exists, that lack of enthusiasm, and you need to show that's not you.
I would rather get out there and be working, even if it is not in the best job in the world, than be sitting at home doing nothing.
Even if what you are doing isn't the perfect job it can still be a very good thing. Because not working for a long period can be dangerous.
It is very, very easy to get out of the habit of working, and if you have long breaks between jobs on your CV it sends a bad message to prospective employers.

Talk to people. Lots of jobs and business opportunities come out of conversation.
People in this country often find it very difficult to say "I'm good at that", but so long as you don't say it with arrogance, and aren't afraid to also admit when you aren't so good at something, there is nothing wrong with it.
So think about what you are good at and talk to people, network as much as possible, keep your radar switched on and bring out ideas and plans through conversation.
If you've been working in the same job for a long time it could suggest to a private sector employer that you have chosen to stay in a safe environment and are perhaps starting to wind your career down.
You need to be sensitive to that and ensure that instead of fuelling that notion, you eradicate it. Be enthusiastic and passionate. It is not about age, it is about attitude.

Often the best things come out of really difficult times. That idea you've been considering, that change of direction you've been thinking about - now is the time.
You could now be saying to yourself: "What have I got to lose?"

Friday 11 February 2011

How to edit PATH?

Setting the path is very easy.

If you use a bash shell (as you probably do), just type

$ export PATH=$PATH:/im/appending/this/to/the/path

If you want the change to apply every time (otherwise you will have to type it each time you open a new shell), there's a hidden file in your home directory:

~/.bashrc

Edit it with your text editor and append the line above to the file.
This file contains a bunch of commands that are executed every time you start an interactive bash shell. So edit it, then start a new shell, and all those commands will be executed. If you appended the command above, your path will be set too.
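
One thing to watch once the export line lives in a startup file: if the file is read more than once in the same session, the directory gets appended again each time. A minimal sketch of a guard, assuming the same illustrative directory as above (the variable name NEW_DIR is only for this example):

```shell
# Directory from the example above; purely illustrative.
NEW_DIR="/im/appending/this/to/the/path"

# Wrapping PATH in colons lets one pattern match the directory
# whether it sits at the start, middle, or end of the list.
case ":$PATH:" in
    *":$NEW_DIR:"*)
        ;;  # already on PATH, nothing to do
    *)
        export PATH="$PATH:$NEW_DIR"
        ;;
esac
```

Running these lines twice should leave the directory on the path only once.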

In this same file you can do some other cool stuff; for example, you can set some aliases.

Aliases are very useful!
I guess you get as bored as anybody of always having to write long commands like:

$ sudo apt-get install mypackage

You can then define an alias in your .bashrc like this:

alias ai='sudo apt-get install'

so you can install stuff just by typing:

$ ai mypackage

You can also define another alias like

alias asrc='sudo apt-cache search'

so you can search for, say, a browser package from the command line just by typing:

$ asrc browser

and so on. There are already a few examples in the file that are commented out, like

alias ll='ls -al'

which is pretty useful too: it is quicker to type than ls -al, and it shows more than plain ls does (hidden files and a detailed listing). If you like it, just remove the # from the beginning of the line to activate it.

Be careful to check whether your alias name already exists as a command, or you risk hiding a useful command.

If, for example, you redefine cd

alias cd='ls'

every time you use cd to change directory, it will list the contents of the directory instead. You probably don't want that!
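
A quick way to do that check is the shell's type builtin, which reports whether a name is already an alias, function, builtin or program. A minimal sketch; check_alias_name is a hypothetical helper written for this example, not a standard command:

```shell
# Print whether a name is already taken in the current shell.
# check_alias_name is a made-up helper for illustration only.
check_alias_name() {
    if type "$1" >/dev/null 2>&1; then
        echo "$1: already in use"
    else
        echo "$1: free to use"
    fi
}

check_alias_name cd      # cd is a shell builtin
check_alias_name asrc    # result depends on what is installed on your system
```

If the name comes back as already in use, pick a different alias name.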

Saturday 22 January 2011

顺其自然 let it be


Don't trouble trouble until trouble troubles you. 别自找麻烦了,顺其自然吧。
Some of us like to have our future mapped out, others like to go with the flow.
It's best to take the world as you find it, then you won't be disappointed.
Be that as it may, I'll leave it as it is. 无论如何我们也只能听其自然。
Since there is nothing we can do, why not let nature take its course?
Your opposition won't make any difference, just let it be.
We will cross the bridge when we come to it. 走一步看一步吧,顺其自然。
Whatever will be, will be. 万事不必苛求,顺其自然。
The Beatles - Let it Be Lyrics
When I find myself in times of trouble, mother Mary comes to me,
speaking words of wisdom, let it be.
And in my hour of darkness she is standing right in front of me,
speaking words of wisdom, let it be.

Let it be, let it be, let it be, let it be.
Whisper words of wisdom, let it be.

And when the broken hearted people living in the world agree,
there will be an answer, let it be.
For though they may be parted there is still a chance that they will see,
there will be an answer. let it be.

Let it be, let it be, .....

And when the night is cloudy, there is still a light, that shines on me,
shine until tomorrow, let it be.
I wake up to the sound of music, mother Mary comes to me,
speaking words of wisdom, let it be.

Let it be, let it be, .....

Thursday 13 January 2011

'Used to do' or 'use to do' vs would & Used to something

'Used to do' or 'use to do' vs would 

If we say something used to happen we are talking about repeated events and actions in the past, usually things that happened a long time ago and are now finished.
To express this we can use either used to or would.
  • When I was young I used to play with my dolls. = When I was young I would play with my dolls.
Of course I no longer play with dolls!
  • We used to go out a lot in the summer.
Implies that we no longer go out much.
If you want to talk about repeated states or habits in the past, you must use used to; you cannot use would:
  • My dog used to bark at cats.
  • I used to smoke.
  • I used to be an administrative assistant.
  • I used to live in England.
You should use 'use to' without a d in sentences when it follows 'did' or 'didn't' (don't worry too much about this because lots of people get it wrong).
The question form is 'Did you use to…?'. When asking a closed question, you put did/didn't in front of the subject, followed by use to; you cannot use would.
  • Did you use to go out with my sister?
  • Did they use to own the company?
  • Didn't we use to go to the same school?
Also, when asking questions about states in the past, you cannot use would.
  • What sort of things did you use to like when you were young?
In the negative you cannot use would without a change in meaning.
  • I didn't use to play with my dolls.
If I said "I wouldn't play with my dolls", it would mean I refused to play with my dolls.
  • We didn't use to go out much in the winter months.
If I said "we wouldn't go out much", it would mean we refused to go out much.
Note: The general rule is that when there is did or didn't in the sentence, we say use to (without d); when there is no did or didn't in the sentence, we say used to (with d).

Used to something

Used to has another meaning: it can be used as an adjective, and we use it to talk about things that have become familiar and are no longer strange or new.
Used to usually comes after verbs such as be, get or become.
  • After a while you get used to the noise.
  • She will become used to the smell.
  • I was used to the web site.
You can also say that someone is used to doing something.
  • I'll never get used to getting up at six o'clock in the morning.
  • It took me a while until I was used to driving on the right-hand side of the road.

Wednesday 12 January 2011

My Sovos SVEBK5B 5 inch E-Reader+ (Black)

Key Features:
  • 5" Easyread TFT EBook Reader with Audio playback / Photo / Movie support
  • 1GB internal memory
  • SD card slot (can be 8GB)
  • USB 2.0 port
  • Integral speaker
  • 3.5mm stereo earphone jack
  • Calendar function
  • Voice recorder function
  • Screen rotate function
  • Power-off timer
  • 3 OSD languages (English / Swedish / Norwegian)
  • EBook formats supported: ANSI/UNICODE TXT, PDF, HTML, FB2, PDB, EPUB
  • Audio formats supported: MP3 (bit rate 8 kbps - 320 kbps), WMA (bit rate 5 kbps - 320 kbps)
  • Image formats supported: JPEG, BMP
Other Features:
  • Rechargeable 7.4V Li-Polymer battery - 1900mA

My brief review: 5 out of 5 stars
Why did I give it 5 stars?

For its many functions: it is not just an e-reader but also a music and video player, and it does everything it claims. For this price it's a bargain and really worth the money.

But it does have shortcomings, especially for people like me who are used to smartphones (mine is an HTC Desire). You will find its functions are very limited and basic. It doesn't even have a home button: every time you want to go back, you have to press the 'BACK' button many times, until you get lost (because it's not very fast and cannot keep up with the speed at which I press the button; as I said, I'm used to the HTC Desire's speed).

It does not work very well with PDF files, but EPUB files are good enough. I just use it to read news transferred from calibre. If you do buy this item, I recommend using calibre with it; you can download a portable version from PortableApps, or download it directly.

In a word, it is worth the price. If you want an e-reader and multimedia player, I recommend you buy it.

One more thing: if you want to enjoy music, you'll have to buy your own headphones, because the ones that come with the e-reader are really crappy. :D

Tuesday 11 January 2011

HOWTO: Notepad++ on Ubuntu via WINE

Steps (easier than expected):
  1. System > Administration > Synaptic Package Manager
  2. Supply password and Quick Search for "WINE" (no quotes)
  3. Mark WINE for installation and Apply
  4. (let the package manager install the dependencies it needs)
  5. Download Notepad++
  6. Extract the zip file
  7. Create an application launcher with the command:
    wine <path-to-notepad++.exe>
For instance:
Download the file and unzip it to the "notepad" folder under the "Downloads" folder.
The corresponding Notepad++ application launcher command is:
wine /home/srs/Downloads/notepad/ansi/notepad++.exe
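
If the launcher does nothing when clicked, it can help to run the same command from a small wrapper that checks the two usual suspects first: wine missing, or a wrong path. A minimal sketch, assuming the example path above; launch_npp and NPP_EXE are hypothetical names made up for this example:

```shell
# Path from the example above; adjust to wherever you unzipped Notepad++.
NPP_EXE="${NPP_EXE:-/home/srs/Downloads/notepad/ansi/notepad++.exe}"

# launch_npp is a made-up helper name, not part of WINE or Notepad++.
launch_npp() {
    # Check that the wine command is available on PATH.
    if ! command -v wine >/dev/null 2>&1; then
        echo "wine is not installed" >&2
        return 1
    fi
    # Check that the executable is where we expect it.
    if [ ! -f "$NPP_EXE" ]; then
        echo "Notepad++ not found at $NPP_EXE" >&2
        return 1
    fi
    # Pass any file names given on the command line through to Notepad++.
    wine "$NPP_EXE" "$@"
}
```

Save it as a script that ends with a call to launch_npp "$@", and point the application launcher at that script instead.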

If the above doesn't work for you, the "official" documentation is a useful resource. You can also google Notepad++ and WINE to find some more gems.

WINE lets you run a good smattering of Windows applications on Linux.

Monday 3 January 2011

Google Research: Lessons learned developing a practical large scale machine learning system (including comments)

Source:

Lessons learned developing a practical large scale machine learning system

Tuesday, April 06, 2010 at 08:00:00 AM

When faced with a hard prediction problem, one possible approach is to attempt to perform statistical miracles on a small training set. If data is abundant then often a more fruitful approach is to design a highly scalable learning system and use several orders of magnitude more training data.

This general notion recurs in many other fields as well. For example, processing large quantities of data helps immensely for information retrieval and machine translation.

Several years ago we began developing a large scale machine learning system, and have been refining it over time. We gave it the codename “Seti” because it searches for signals in a large space. It scales to massive data sets and has become one of the most broadly used classification systems at Google.

After building a few initial prototypes, we quickly settled on a system with the following properties:

  • Binary classification (produces a probability estimate of the class label)
  • Parallelized
  • Scales to process hundreds of billions of instances and beyond
  • Scales to billions of features and beyond
  • Automatically identifies useful combinations of features
  • Accuracy is competitive with state-of-the-art classifiers
  • Reacts to new data within minutes
Seti’s accuracy appears to be pretty decent. For example, tests on standard smaller datasets indicate that it is comparable with modern classifiers.

Seti has the flexibility to be used on a broad range of training set sizes and feature sets. These sizes are substantially larger than those typically used in academia (e.g., the largest UCI dataset has 4 million instances). A sample of the data sets used with Seti gives the following statistics:

          Training set size    Unique features
Mean      100 Billion          1 Billion
Median    1 Billion            10 Million

A good machine learning system is all about accuracy, right?

In the process of designing Seti we made plenty of mistakes. However, we made some good key decisions as well. Here are a few of the practical lessons that we learned. Some are obvious in hindsight, but we did not necessarily realize their importance at the time.

Lesson: Keep it simple (even at the expense of a little accuracy).

Having good accuracy across a variety of domains is very important, and we were tempted to focus exclusively on this aspect of the algorithm. However, in a practical system there are several other aspects of an algorithm that are equally critical:
  • Ease of use: Teams are more willing to experiment with a machine learning system that is simple to set up and use. Those teams are not necessarily die-hard machine learning experts, and so they do not want to waste much time figuring out how to get a system up and running.
  • System reliability: Teams are much more willing to deploy a reliable machine learning system in a live environment. They want a system that is dependable and unlikely to crash or need constant attention. Early versions of Seti had marginally better accuracy on large data sets, but were complex, stressed the network and GFS architecture considerably, and needed constant babysitting. The number of teams willing to deploy these versions was low.
Seti is typically used in places where a machine learning system will provide a significant improvement in accuracy over the existing system. The gains are usually large enough that most teams do not care about the small differences in accuracy between different flavors of algorithms. And, in practice, the small differences are often washed out by other effects such as better data filtering, adding another useful feature, parameter tuning, etc. Teams much prefer having a stable, scalable and easy-to-use classification system. We found that these other aspects can be the difference between a deployable system and one that gets abandoned.

It is perhaps less academically interesting to design an algorithm that is slightly worse in accuracy, but that has greater ease of use and system reliability. However, in our experience, it is very valuable in practice.

Lesson: Start with a few specific applications in mind.

It was tempting to build a learning system without focusing on any particular application. After all, our goal was to create a large scale system that would be useful on a wide variety of present and future classification tasks. Nevertheless, we decided to focus primarily on a small handful of initial applications. We believe this decision was useful in several ways:

  • We could examine what the small number of domains had in common. By building something that would work for a few domains, it was likely the resulting system would be useful for others.
  • More importantly, it helped us quickly decide what aspects were unnecessary. We noticed that it was surprisingly easy to over-generalize or over-engineer a machine learning system. The domains grounded our project in reality and drove our decision making. Without them, even deciding how broad to make the input file format would have been harder (e.g., is it important to permit binary/categorical/real-valued features? Multiple classes? Fractional labels? Weighted instances?).
  • Working with a few different teams as initial guinea pigs allowed us to learn about common teething problems, and helped us smooth the process of deployment for future teams.
Lesson: Know when to say “no”.

We have a hammer, but we don't want to end up with bent screws. Being machine learning practitioners, it was very tempting for us to always recommend using machine learning for a problem. We saw very early on that, despite its many significant benefits, machine learning typically adds complexity, opacity and unpredictability to a system. In reality, simpler techniques are sometimes good enough for the task at hand. And in the long run, the extra effort that would have been spent integrating, maintaining and diagnosing issues with a live machine learning system could be spent on other ways of improving the system instead.

Seti is often used in places where there is a good chance of significantly improving predictive accuracy over the incumbent system. And we usually advise teams against trying the system when we believe there is likely to be only a small improvement.

Large-scale machine learning is an important and exciting area of research. It can be applied to many real world problems. We hope that we have given a flavor of the challenges that we face, and some of the practical lessons that we have learned.


Mr.Wizard said...
Can you give some examples of some places where this is used?
methode said...
Google Translate, i guess :) It would be quite stupid to not use it on a service like Translate. Or I imagine it's used in the "Did You Mean..." service as well.
Glowing Face Man said...
How about some links where we can see this in action :)
threeiem said...
It would be great if you could do some learning with climate data. There is tons of it and it would serve a great purpose. Here is a link to lots of data from NOAA..
3145 said...
I would say filter images by face only or smthg like that, if that works based on that system I'm sooooo impressed.
Dan said...
The suggestion that small sample sizes are inadequate for machine learning might be a bit misleading. Human and animal neuronal systems easily learn complex categorization tasks in a very small number of trials. Humans and animals cannot live long enough to be exposed to billions of learning trials. Typically, learning asymptotes in accuracy in classical reinforcement studies of category learning in pigeons in fewer than a thousand trials. See for example the famous Cerella (1980) study where pigeons learned to classify whether Charlie Brown was in complex cartoon pictures with many different Peanuts characters and scenes with 95% accuracy in 800 learning trials. Charlie Brown was actually in about 400 of these learning trials in this case. While it may be true that semi-supervised learning based upon small numbers of labeled trials such as one labeled event does not generally work very well, this Google researcher needs to be aware of what is possible in supervised learning based upon animal learning studies (unless he wants to reinvent the wheel) and then he needs to be aware of newer developments in supervised machine learning such as the Generalized Reduced Error Logistic Regression Machine (RELR). Learning can easily asymptote in accuracy in RELR in the same number of trials as is typically seen in classical reinforcement studies in animals. More importantly, these RELR models are simple, interpretable, and highly accurate models that do not exhibit the black-box character of complex machine learning paradigms.
Ronald said...
Learning is the self organization of data. You seem to build the usual recognition engine. Recognition is not learning. Learning includes the building of abstracts or generalizations by the machine, recognition does not.
Tadej said...
Any possibility of telling us about the mathematics of the concrete underlying method that exhibits those properties? Possibly a future paper?
Dan said...
I would disagree with Ronald's comments about recognition not requiring abstraction. I recognize a dog even when it has only three legs; so could any accurate machine learning algorithm. This category of dog can be learned through some form of supervised learning that tells me the probability that certain combinations of features predict a certain category. The fact that it is probabilistic allows for the abstraction and generalization. Ronald's definition of learning would seem to ignore the vast majority of what is considered learning - that is supervised learning. Clearly humans and animals can learn categories very quickly through supervised learning and this does not require a billion learning trials even when the number of potential features is very large such as in millions of potential features that would arise through all the interactions between features seen through large numbers of Peanuts cartoon strips.
Ronald said...
Sorry Dan, the system ate my long response, had to run away a few times. Anyway what it really boils down to. Recognition is not learning, but learning includes recognition. Try to teach your system math and have it use it independently. Now teach it one-two-many math(math not equal math, its culture dependent) as best as we Westerners can understand it, no change in anything. Or for starters, teach it "all", What kind of abstraction does it build and use on its own? Think about how a brain layer/region does not feed back data to the layer it received data from. Or how to build decomposition with stochastic behavior. From where I stand it will be hard pressed without its own data organization. But I agree Google is way behind.
Dan said...
Ronald, I agree with you that there are limits to what passive, supervised learning systems can do. For any more natural learning of higher cognitive concepts, I believe that a form of active learning would be required. Yet, we are at a point in this field where we need to have a reasonable model for the “engram” before we can build massively parallel and distributed systems that have higher cognitive capabilities anywhere close to humans or even simple animals, such as pigeons. I actually believe that Google’s basic proposal for a massively parallel machine learning system is probably on the cutting edge in all areas except that they lack a reasonable model for this “engram”. The brain’s engrams are distributed representations for the fundamental categories, words, objects etc. that form cognition, but the brain’s engrams do not require a billion learning trials to be formed. My suggestion is that rather than immediately dismiss small sample size learning as a “statistical miracle”, they may wish to view this as something that a natural system like the brain must do in its engram formation. Once they open their minds to this possibility and learn about an algorithm such as RELR that does not arbitrarily impose L1/L2 regularization to achieve this, they may also be surprised that this is not a “statistical miracle”.
JezC said...
I'm expecting that this is used for AdWords Broad Matching and possibly organic ranking; language processing to create conceptually relationships? In AdWords, there's a fairly obvious feedback for machine learning - more clicks in response to better selections of adverts, and this would need large samples because of irregular user behaviour in response to adverts.
Ronald said...
Dan, depends all on what one wants to do. If one wants to analyze text, I would go with an self organizing system. Since it can learn the ambiguous structure of human language. For example: I use TTL (Time To Live) for analysis. In other words the system is "none" numerical and doesn't phrase text, it uses flow in time to associate differences in structure with meaning. Like: "I see" and "See I", have a different flow in time and a different meaning and the system can easily organize that. Think about it as columns (pronunciation) on a pane over time, looks like a wave in 3d (except it can/will twist and turn in any direction). Or why can a Magpie(bird, really different brain structure) recognize "self" from a mirror but not from a picture. What data does the mirror present a picture does not? I would say space timing info. In other words, most if not all higher cognitive functions can be presented and tested as space timing models. Including math and learning what "see" means. Would I use it to analyze global warming data, I don't think so.
amanfromMars said...
Alex said...
PROs:
  • they have found some good principles which are simple enough to make users interact in a meaningful way with the system
  • these principles are general enough to not cause awkward procedures to deal with some subsets of data
  • they have defined what is better not to deal with in the implementation of the system.
DOUBTs:
  • it seems like a "brute force attack" approach, leveraging Google's massive computational power when dealing with highly parallelized algorithms.
  • there is no emphasis on how to deal with the sparse nature of categorization.
  • Knowledge and categories are units of information with specific boundaries which must be updated as new data comes in.
  • how about clustering information in schemata with prototypes, with multiple hierarchies based on the "domain" or context at hand? Clustering is the only way to deal with sparsity. Schemata must be organized from general concepts to more specific ones.
  • The human brain analyses patterns with huge parallelism; it also correlates together schemas with a concurrent high degree of belief in a distributed way. However, when schemata are evaluated, merged or redefined, the brain retrieves and processes information in a more sequential fashion, relying heavily on hierarchies among schemas.
Kumar said...
+1 for Mr. Wizard. Please give some examples of the kinds of problems that this massively scalable machine learning system solves in much better ways than whatever other approaches in use. Without that, I'm not sure what I'd gain by reading this research post.
dinesh said...
@ dan @ ronald You may find this recent post on The Noisy Channel about Information Retrieval using a Bayesian Model of Learning and Generalization interesting: (
Ronald said...
The problem I have with bayesian systems is. They try to avoid basic cell behavior instead of taking advantage of it. Simple example, cell behavior is: Stochastic subjective to exhaustion myelin sheath BAC . to name a few. Now if we want to do decomposition in a "fixed" connected network. Which basically requires specifics(n1)-> generalization(gn)-> specifics(gn...) If we use stochastic and exhaustive behavior we can try different specifics(gn). If we combine this with the myelin sheath and BAC we can introduce deterministic behavior. All of this is missing form your "normal" bayesian system, that some people associate with intelligent. Yet the real system does just that.
Dmitry Chichkov said...
Any plans to release it to the public? It looks like Microsoft had released its own toolkit (SIGMA: Large-Scale and Parallel Machine-Learning Tool Kit).
Mitu said...
Continuous Signal and Linear System