Your Data Zen starts here.

Science Behind the Pictures

I get this question a lot: “Nice picture, but what does it actually mean? What’s the benefit for my company / research / process?”

My answer is somewhat broader. Join me on this journey!

Note: The picture above is in fact a shipping web application, shown the way Google's bots see it.

Visualization and contextual processing of information are crucial for the whole learning process.
The more we are able to use abstraction in our critical thinking, the more we can learn.

IT security, science in general and the world of business all give us a lot of information to digest. Without proper contextual processing, we might be driven to wrong conclusions with a huge, irreversible impact.

The human brain has evolved so that it can take in a huge amount of data in visual form.

That is how we evolved, and even blind people store information in their brain the same way as if it had come in through their sight.

That is why visualization is so powerful and makes the people who work with it extremely efficient.

Examples of enhancements:

  • Anomalies: Using our sight, we can spot an anomaly in visualized data immediately. Just try to spot an anomaly in a huge spreadsheet of numbers by reading them one by one while trying not to fall asleep.
  • Pattern detection: We have evolved to quickly find patterns in visual input. Is it moving? Does it look nice or dangerous? How does it relate to other data?
  • Memory: By remembering the whole picture, we can quickly recall even very complex memories and data structures. This is partly because a visually interesting subject can make us feel an emotion connected to it. Again, try to remember a set of numbers. Then imagine them as shapes and colors and try again. It's much easier.

All right, so what do the pictures actually mean?

Using various data manipulations, I have generated something that can also be used as a semantic map for a given subject.

Very similar structures, at a certain level of abstraction, are created by biology in our brain during the learning process.

In computer science, a similar approach is used to build so-called clusters, and in big data in general. It is also a basic requirement for building AI (artificial intelligence).

A semantic map is used to build the schema for a database that suggests restaurants, music you might like, business contacts and even dates on Tinder ;-)


Here starts the geeky text, feel free to skip it:
I took use cases I am familiar with and produced semantic maps for them. To do so, it is usually necessary to clean up the data and, where needed, anonymize the data set and its sources.

Sometimes the relationships are obvious, as in the case of web/Twitter maps. Sometimes it is necessary to use statistical methods to determine the importance of the relation between classes. This is typical for biological networks and for cases where the data are “noisy”.
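To make the "noisy data" case a bit more concrete, here is a minimal sketch of one possible approach, not necessarily the one used for the pictures on this site. It counts how often two items show up together and keeps only pairs that repeat. The function name `cooccurrence_edges`, the `min_count` cutoff and the gene names are all made up for illustration, and a plain count threshold is only a crude stand-in for a proper statistical test.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(observations, min_count=2):
    """Count how often two items appear together; keep only repeated pairs."""
    counts = Counter()
    for items in observations:
        # sorted(set(...)) makes (a, b) and (b, a) the same undirected edge
        # and ignores duplicates within one observation
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    # Pairs seen fewer than min_count times are treated as noise (assumption)
    return {edge: n for edge, n in counts.items() if n >= min_count}


# Hypothetical observations: each row lists items that were seen together
observations = [
    ["geneA", "geneB", "geneC"],
    ["geneA", "geneB"],
    ["geneB", "geneC"],
    ["geneA", "geneD"],  # this pair appears only once
]

edges = cooccurrence_edges(observations)
print(edges)
```

The surviving pairs, with their counts as weights, are what would be drawn as edges in the map. Raising `min_count` gives a cleaner but sparser picture.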

Each phase of an analysis requires answering questions such as:

  • The method: how was the data collected in each use case? Was it reproducible? (How) does the collection method affect the structure of the output map?
  • The algorithm used for pulling the data: what is its performance? How prone is it to errors? How difficult is it to spot an error?
  • Is it possible to automate the steps above, and to scale them both down to a simple one-time activity and up to a professional, industry-ready process? What would be the best technology?
  • How complex is the output structure? Does the complexity depend on the duration of the collection/processing? If so, what is the optimal threshold?
  • What is the best visualization method?
  • ... and many more.

Finally, what is the meaning and how do you read the pictures:

  • Seek patterns in the shapes drawn by the edges.
    With different methods of drawing the map, the shapes might differ, but the structure of the connections is always the same. Every picture is therefore created to highlight a certain fact implied by the data set.
  • Modularity of the edges – I usually use color to show that an edge belongs to a certain “community” in the map. Examples of a community: colleagues from the same office, or friends who talk to each other more often than to anyone else in the room.
    Every community in the map has a certain meaning.
  • How the communities are connected – the centrality of communities and single nodes. You might spot, for example, teams that cooperate more closely within the company.
  • Nodes: betweenness and degree of connection, Erdős number. These are all terms from social network analysis (SNA). A more connected node might indicate the leader of a team, or a single point of failure in a business process.
  • Sometimes I even include the labels of the nodes. This personalizes the whole content and provides very detailed information.
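For the curious, the node metrics from the list above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the graph is a small made-up office network stored as a dictionary of neighbor sets, and betweenness is computed with Brandes' algorithm for unweighted, undirected graphs. Real maps are usually built with dedicated graph tools instead.

```python
from collections import deque


def degree(graph):
    """Number of direct connections of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        stack = []                                # nodes in order of discovery
        preds = {node: [] for node in graph}      # predecessors on shortest paths
        paths = {node: 0 for node in graph}       # number of shortest paths
        dist = {node: -1 for node in graph}       # -1 means not visited yet
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                              # breadth-first search
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        while stack:                              # accumulate from farthest node back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end
    return {node: value / 2 for node, value in score.items()}


# Hypothetical network: a tight trio (ann, bob, cid) linked to eve through dan
graph = {
    "ann": {"bob", "cid"},
    "bob": {"ann", "cid"},
    "cid": {"ann", "bob", "dan"},
    "dan": {"cid", "eve"},
    "eve": {"dan"},
}

print(degree(graph))
print(betweenness(graph))
```

In this toy network "cid" has both the highest degree and the highest betweenness, so it would be the candidate for a team leader or a single point of failure. "dan" has few connections but still sits on every path to "eve", which degree alone would not reveal.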

In fact, all the parameters mentioned above, and many more, can be evaluated with mathematical methods to work with the data at the required level of precision.
So it might sound like look and feel only, but there is a lot of maths, statistics, strategy and sociology behind it.
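As one example of that maths, here is a rough sketch of how the quality of a community split can be scored with the standard modularity measure: for each community, the share of edges that fall inside it minus the share you would expect if edges were placed at random. The graph, the node names and the two candidate splits are made up for illustration. Finding the best split automatically is a separate and harder problem that this sketch does not attempt.

```python
def modularity(graph, community_of):
    """Modularity of a given partition of an undirected, unweighted graph."""
    # Every edge appears twice in the neighbor sets, once per endpoint
    m = sum(len(neighbors) for neighbors in graph.values()) / 2
    inside = {}       # number of edges with both ends in the community
    degree_sum = {}   # sum of node degrees per community
    for node, neighbors in graph.items():
        c = community_of[node]
        degree_sum[c] = degree_sum.get(c, 0) + len(neighbors)
        for other in neighbors:
            if community_of[other] == c:
                inside[c] = inside.get(c, 0) + 0.5  # each edge is seen twice
    return sum(
        inside.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical graph: two triangles joined by a single edge (c, d)
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

two_teams = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_blob = {node: 0 for node in graph}

print(modularity(graph, two_teams))
print(modularity(graph, one_blob))
```

The split into two triangles scores higher than lumping everything together, which matches what the eye sees when the two groups are drawn in different colors.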

Do you still have questions? Reach out to me!

© 2020 4n6strider
