Saturday, January 30, 2016

Scientific visualization

A scientific visualization of a simulation of a Rayleigh–Taylor instability caused by two mixing fluids.[1]
Surface rendering of Arabidopsis thaliana pollen grains imaged with a confocal microscope.

Scientific visualization (also spelled scientific visualisation) is an interdisciplinary branch of science. According to Friendly (2008), it is "primarily concerned with the visualization of three-dimensional phenomena (architectural, meteorological, medical, biological, etc.), where the emphasis is on realistic renderings of volumes, surfaces, illumination sources, and so forth, perhaps with a dynamic (time) component".[2] It is also considered a branch of computer science that is a subset of computer graphics. The purpose of scientific visualization is to graphically illustrate scientific data so that scientists can understand, explore, and glean insight from their data.


History
Charles Minard's flow map of Napoleon’s March.

One of the earliest examples of three-dimensional scientific visualisation was Maxwell's thermodynamic surface, sculpted in clay in 1874 by James Clerk Maxwell.[3] This prefigured modern scientific visualization techniques that use computer graphics.[4]

Notable early two-dimensional examples include the flow map of Napoleon’s March on Moscow produced by Charles Joseph Minard in 1869;[2] the “coxcombs” used by Florence Nightingale in 1857 as part of a campaign to improve sanitary conditions in the British army;[2] and the dot map used by John Snow in 1855 to visualise the Broad Street cholera outbreak.[2]
Methods for visualizing two-dimensional data sets

Scientific visualization using computer graphics gained in popularity as graphics matured. Primary applications were scalar fields and vector fields from computer simulations and also measured data. The primary methods for visualizing two-dimensional (2D) scalar fields are color mapping and drawing contour lines. 2D vector fields are visualized using glyphs and streamlines or line integral convolution methods. 2D tensor fields are often resolved to a vector field by using one of the two eigenvectors to represent the tensor at each point in the field and then visualized using vector field visualization methods.
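
As an illustrative sketch only (not tied to any particular visualization package), the following Python fragment applies these 2D techniques to a synthetic field, assuming numpy and matplotlib are available: a color map with overlaid contour lines for the scalar field, and streamlines for its gradient treated as a vector field.

    import numpy as np
    import matplotlib.pyplot as plt

    # Synthetic 2D scalar field; its gradient serves as a 2D vector field.
    x, y = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
    scalar = np.exp(-(x**2 + y**2)) * np.sin(3 * x)
    gy, gx = np.gradient(scalar)          # axis 0 varies y, axis 1 varies x

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Scalar field: color mapping plus contour lines.
    im = ax1.pcolormesh(x, y, scalar, cmap="viridis", shading="auto")
    ax1.contour(x, y, scalar, colors="black", linewidths=0.5)
    fig.colorbar(im, ax=ax1)

    # Vector field: streamlines colored by magnitude (quiver would give glyphs).
    ax2.streamplot(x, y, gx, gy, color=np.hypot(gx, gy), cmap="magma")
    plt.show()
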
Methods for visualizing three-dimensional data sets

For 3D scalar fields the primary methods are volume rendering and isosurfaces. Methods for visualizing vector fields include glyphs (graphical icons) such as arrows, streamlines and streaklines, particle tracing, line integral convolution (LIC) and topological methods. Later, visualization techniques such as hyperstreamlines[5] were developed to visualize 2D and 3D tensor fields.
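
As a hedged sketch of isosurface extraction (assuming scikit-image is available; older releases name the function marching_cubes_lewiner), the fragment below pulls a constant-value surface out of a synthetic 3D scalar field and renders it as a triangle mesh.

    import numpy as np
    import matplotlib.pyplot as plt
    from mpl_toolkits.mplot3d.art3d import Poly3DCollection
    from skimage import measure

    # Synthetic 3D scalar field: squared distance from the volume's center.
    z, y, x = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
    field = x**2 + y**2 + z**2

    # Extract the isosurface at level 0.5 (a sphere of radius ~0.71).
    verts, faces, normals, values = measure.marching_cubes(field, level=0.5)

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.add_collection3d(Poly3DCollection(verts[faces], alpha=0.6))
    ax.set_xlim(0, 64); ax.set_ylim(0, 64); ax.set_zlim(0, 64)  # voxel coords
    plt.show()
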
Scientific visualization topics
Maximum intensity projection (MIP) of a whole body PET scan.
Solar system image of the main asteroid belt and the Trojan asteroids.
Scientific visualization of fluid flow: surface waves in water.
Chemical imaging of a simultaneous release of SF6 and NH3.
Topographic scan of a glass surface by an atomic force microscope.
Computer animation

Computer animation is the art, technique, and science of creating moving images via the use of computers. Increasingly it is created by means of 3D computer graphics, though 2D computer graphics are still widely used for stylistic, low-bandwidth, and faster real-time rendering needs. Sometimes the target of the animation is the computer itself, but sometimes the target is another medium, such as film. It is also referred to as CGI (computer-generated imagery or computer-generated imaging), especially when used in films.
Computer simulation

A computer simulation is a computer program, or a network of computers, that attempts to simulate an abstract model of a particular system. Computer simulations have become a useful part of the mathematical modelling of many natural systems in physics, computational physics, chemistry and biology; of human systems in economics, psychology, and social science; and of engineering processes and new technology, to gain insight into the operation of those systems or to observe their behavior.[6] The simultaneous visualization and simulation of a system is called visulation.

Computer simulations vary from computer programs that run a few minutes, to network-based groups of computers running for hours, to ongoing simulations that run for months. The scale of events being simulated by computer simulations has far exceeded anything possible (or perhaps even imaginable) using the traditional paper-and-pencil mathematical modeling: over 10 years ago, a desert-battle simulation, of one force invading another, involved the modeling of 66,239 tanks, trucks and other vehicles on simulated terrain around Kuwait, using multiple supercomputers in the DoD High Performance Computer Modernization Program.[7]
Information visualization

Information visualization is the study of "the visual representation of large-scale collections of non-numerical information, such as files and lines of code in software systems, library and bibliographic databases, networks of relations on the internet, and so forth".[2]

Information visualization focuses on the creation of approaches for conveying abstract information in intuitive ways. Visual representations and interaction techniques take advantage of the human eye's broad-bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once.[8] The key difference between scientific visualization and information visualization is that information visualization is often applied to data that is not generated by scientific inquiry. Examples include graphical representations of data for business, government, news and social media.
Interface technology and perception

Interface technology and perception shows how new interfaces and a better understanding of underlying perceptual issues create new opportunities for the scientific visualization community.[9]
Surface rendering

Rendering is the process of generating an image from a model by means of computer programs. The model is a description of three-dimensional objects in a strictly defined language or data structure, containing geometry, viewpoint, texture, lighting, and shading information. The image is a digital image or raster graphics image. The term is used by analogy with an "artist's rendering" of a scene. 'Rendering' is also used to describe the process of calculating effects in a video editing file to produce the final video output. Important rendering techniques are:

Scanline rendering and rasterisation
A high-level representation of an image necessarily contains elements in a different domain from pixels. These elements are referred to as primitives. In a schematic drawing, for instance, line segments and curves might be primitives. In a graphical user interface, windows and buttons might be the primitives. In 3D rendering, triangles and polygons in space might be primitives.

Ray casting
Ray casting is primarily used for real-time simulations, such as those used in 3D computer games and cartoon animations, where detail is not important, or where it is more efficient to manually fake details in order to obtain better performance in the computational stage. This is usually the case when a large number of frames need to be animated. The resulting surfaces have a characteristic 'flat' appearance when no additional tricks are used, as if objects in the scene were all painted with a matte finish.
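
To make the idea concrete, here is a toy ray caster in Python; every name and constant is invented for illustration. One primary ray is cast per pixel and tested against a single sphere, with a flat Lambert shading term producing exactly the matte look described above.

    import numpy as np

    W, H = 320, 240
    center, radius = np.array([0.0, 0.0, 3.0]), 1.0     # sphere in front of camera
    to_light = np.array([1.0, 1.0, -1.0]) / np.sqrt(3)  # direction toward the light

    image = np.zeros((H, W))
    for j in range(H):
        for i in range(W):
            # Primary ray through this pixel from a pinhole camera at the origin.
            d = np.array([(i - W / 2) / W, (j - H / 2) / H, 1.0])
            d /= np.linalg.norm(d)
            # Ray-sphere intersection: solve |t*d - center|^2 = radius^2 for t.
            b = d.dot(center)
            disc = b * b - center.dot(center) + radius * radius
            if disc >= 0.0:
                t = b - np.sqrt(disc)                   # nearest hit
                if t > 0.0:
                    n = (t * d - center) / radius       # unit surface normal
                    image[j, i] = max(n.dot(to_light), 0.0)  # flat matte shading
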

Radiosity
Radiosity, also known as Global Illumination, is a method that attempts to simulate the way in which directly illuminated surfaces act as indirect light sources that illuminate other surfaces. This produces more realistic shading and seems to better capture the 'ambience' of an indoor scene. A classic example is the way that shadows 'hug' the corners of rooms.

Ray tracing
Ray tracing is an extension of the same technique developed in scanline rendering and ray casting. Like those, it handles complicated objects well, and the objects may be described mathematically. Unlike scanline rendering and ray casting, ray tracing is almost always a Monte Carlo technique, that is, one based on averaging a number of randomly generated samples from a model.

Volume rendering

Volume rendering is a technique used to display a 2D projection of a 3D discretely sampled data set. A typical 3D data set is a group of 2D slice images acquired by a CT or MRI scanner. Usually these are acquired in a regular pattern (e.g., one slice every millimeter), with each slice containing the same number of image pixels in a regular pattern. This is an example of a regular volumetric grid, with each volume element, or voxel, represented by a single value obtained by sampling the immediate area surrounding it.
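
A minimal sketch of operating on such a grid (numpy assumed; the random array stands in for real scanner slices) is a maximum intensity projection, the technique behind the PET image shown earlier: each output pixel keeps the brightest voxel along its viewing ray.

    import numpy as np

    # e.g. 100 slices, one per millimeter, each 512x512 pixels
    volume = np.random.rand(100, 512, 512)   # stand-in for stacked CT/MRI slices

    # Maximum intensity projection along the slice axis: every output pixel is
    # the brightest voxel encountered along its ray through the volume.
    mip = volume.max(axis=0)                 # 2D image, shape (512, 512)
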
Volume visualization

According to Rosenblum (1994), "volume visualization examines a set of techniques that allows viewing an object without mathematically representing its surface. Initially used in medical imaging, volume visualization has become an essential technique for many sciences, portraying phenomena such as clouds, water flows, and molecular and biological structure. Many volume visualization algorithms are computationally expensive and demand large data storage. Advances in hardware and software are generalizing volume visualization as well as real-time performance."[9]
Scientific visualization applications

This section gives a series of examples of how scientific visualization can be applied today.[10]
In the natural sciences

Star formation[11]

Gravitational waves[12]

Massive Star Supernovae Explosions

Molecular rendering

Star formation: The featured plot is a Volume plot of the logarithm of gas/dust density in an Enzo star and galaxy simulation. Regions of high density are white while less dense regions are more blue and also more transparent.

Gravitational waves: Researchers used the Globus Toolkit to harness the power of multiple supercomputers to simulate the gravitational effects of black-hole collisions.

Massive star supernova explosions: the image shows three-dimensional radiation hydrodynamics calculations of massive star supernova explosions. The DJEHUTY stellar evolution code was used to calculate the explosion of a model of SN 1987A in three dimensions.

Molecular rendering: VisIt's general plotting capabilities were used to create the molecular rendering shown in the featured visualization. The original data was taken from the Protein Data Bank and turned into a VTK file before rendering.
In geography and ecology

Terrain rendering

Climate visualization[13]

Atmospheric Anomaly in Times Square

Terrain visualization: VisIt can read several file formats common in the field of Geographic Information Systems (GIS), allowing one to plot raster data such as terrain data in visualizations. The featured image shows a plot of a DEM dataset containing mountainous areas near Dunsmuir, CA. Elevation lines are added to the plot to help delineate changes in elevation.

Tornado Simulation: This image was created from data generated by a tornado simulation calculated on NCSA's IBM p690 computing cluster. High-definition television animations of the storm produced at NCSA were included in an episode of the PBS television series NOVA called "Hunt for the Supertwister." The tornado is shown by spheres that are colored according to pressure; orange and blue tubes represent the rising and falling airflow around the tornado.

Climate visualization: This visualization depicts the carbon dioxide from various sources that are advected individually as tracers in the atmosphere model. Carbon dioxide from the ocean is shown as plumes during February 1900.

Atmospheric anomaly in Times Square: the image visualizes results from the SAMRAI simulation framework of an atmospheric anomaly in and around Times Square.
View of a 4D cube projected into 3D: orthogonal projection (left) and perspective projection (right).
In mathematics

Scientific visualization of mathematical structures has been undertaken for purposes of building intuition and for aiding the forming of mental models.[14]

Higher-dimensional objects can be visualized in the form of projections (views) in lower dimensions. In particular, 4-dimensional objects are visualized by means of projection into three dimensions. The lower-dimensional projections of higher-dimensional objects can be used for purposes of virtual object manipulation, allowing 3D objects to be manipulated by operations performed in 2D,[15] and 4D objects by interactions performed in 3D.[16]
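
A small sketch of the 4D case (plain numpy; the viewer distance is an arbitrary choice): the 16 vertices of a tesseract are projected into 3D either by perspective division along the fourth axis or, for the orthogonal projection, by simply dropping the fourth coordinate.

    import numpy as np
    from itertools import product

    # The 16 vertices of a tesseract (4D hypercube) with coordinates +-1.
    vertices = np.array(list(product([-1.0, 1.0], repeat=4)))

    def project_4d_to_3d(points, viewer_w=3.0):
        """Perspective projection: scale x, y, z by the distance from a
        viewer placed at w = viewer_w on the fourth axis."""
        scale = viewer_w / (viewer_w - points[:, 3])
        return points[:, :3] * scale[:, None]

    perspective_3d = project_4d_to_3d(vertices)   # inner and outer cube differ
    orthogonal_3d = vertices[:, :3]               # both cubes coincide in size
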
In the formal sciences

Curve plots

Image annotations

Scatter plot

Computer mapping of topographical surfaces: Through computer mapping of topographical surfaces, mathematicians can test theories of how materials will change when stressed. The imaging is part of the work of the NSF-funded Electronic Visualization Laboratory at the University of Illinois at Chicago.

Curve plots: VisIt can plot curves from data read from files, and it can extract and plot curve data from higher-dimensional datasets using lineout operators or queries. The curves in the featured image correspond to elevation data along lines drawn on DEM data and were created with the lineout capability. Lineout allows the user to interactively draw a line, which specifies a path for data extraction. The resulting data was then plotted as curves.

Image annotations: The featured plot shows Leaf Area Index (LAI), a measure of global vegetative matter, from a NetCDF dataset. The primary plot is the large plot at the bottom, which shows the LAI for the whole world. The plots on top are actually annotations that contain images generated earlier. Image annotations can be used to include material that enhances a visualization such as auxiliary plots, images of experimental data, project logos, etc.

Scatter plot: VisIt's Scatter plot allows the user to visualize multivariate data of up to four dimensions. The Scatter plot takes multiple scalar variables and uses them for different axes in phase space. The variables are combined to form coordinates in the phase space, displayed using glyphs and colored using another scalar variable.
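
A sketch in the same spirit, though not using VisIt itself: with matplotlib, and the iris data from scikit-learn purely as a stand-in dataset, four scalar variables can be mapped to x-position, y-position, glyph color, and glyph size.

    import matplotlib.pyplot as plt
    from sklearn import datasets

    iris = datasets.load_iris()
    x, y, c, s = (iris.data[:, k] for k in range(4))   # four scalar variables

    # Two variables become spatial axes; the third colors the glyphs and the
    # fourth scales them, giving a four-dimensional scatter plot.
    sc = plt.scatter(x, y, c=c, s=40 * s, cmap="viridis", alpha=0.7)
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[1])
    plt.colorbar(sc, label=iris.feature_names[2])
    plt.show()
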
In the applied sciences

Porsche 911 model

YF-17 aircraft Plot

City rendering

Porsche 911 model (NASTRAN model): The featured plot contains a Mesh plot of a Porsche 911 model imported from a NASTRAN bulk data file. VisIt can read a limited subset of NASTRAN bulk data files, in general enough to import model geometry for visualization.

YF-17 aircraft plot: The featured image displays plots of a CGNS dataset representing a YF-17 jet aircraft. The dataset consists of an unstructured grid with solution. The image was created using a pseudocolor plot of the dataset's Mach variable, a Mesh plot of the grid, and a Vector plot of a slice through the Velocity field.

City rendering: An ESRI shapefile containing a polygonal description of the building footprints was read in and then the polygons were resampled onto a rectilinear grid, which was extruded into the featured cityscape.

Inbound traffic measured: This image is a visualization study of inbound traffic measured in billions of bytes on the NSFNET T1 backbone for the month of September 1991. The traffic volume range is depicted from purple (zero bytes) to white (100 billion bytes). It represents data collected by Merit Network, Inc.[17]
Scientific visualization organizations

Important laboratories in the field are:

Electronic Visualization Laboratory
NASA Goddard Scientific Visualization Studio.[18]

Conferences in this field, ranked by significance in scientific visualization research, are:

IEEE Visualization
EuroVis
SIGGRAPH
Eurographics
Graphicon

See further: Category:Computer graphics organizations
See also

General

ACM Transactions on Graphics
Data Presentation Architecture
Data visualization
Mathematical visualization
Molecular graphics
Skin friction line
Tensor glyph
Visulation
Visual analytics

People

Tristan Needham

Software

Avizo
Baudline
Bitplane
Datacopia
Dataplot
DataMelt
DeDaLo
MeVisLab
NCAR Command Language
Orange
ParaView
Sirius visualization software
Tecplot
tomviz
VAPOR
Vis5D
VisAD
VisIt
VTK
Category:Free data visualization software

References

Visualizations that have been created with VisIt. at wci.llnl.gov. Updated: November 8, 2007
Michael Friendly (2008). "Milestones in the history of thematic cartography, statistical graphics, and data visualization".
James Clerk Maxwell and P. M. Harman (2002), The Scientific Letters and Papers of James Clerk Maxwell, Volume 3; 1874–1879, Cambridge University Press, ISBN 0-521-25627-5, p. 148.
Thomas G. West (February 1999). "James Clerk Maxwell, Working in Wet Clay". SIGGRAPH Computer Graphics Newsletter 33 (1): 15–17. doi:10.1145/563666.563671.
Delmarcelle, T.; Hesselink, L. (1993). "Visualizing second-order tensor fields with hyperstreamlines". IEEE Computer Graphics and Applications 13 (4).
Steven Strogatz (2007). "The End of Insight". In: What is your dangerous idea? John Brockman (ed). HarperCollins.
"Researchers stage largest military simulation ever". (news), Jet Propulsion Laboratory, Caltech, December 1997.
James J. Thomas and Kristin A. Cook (Ed.) (2005). Illuminating the Path: The R&D Agenda for Visual Analytics. National Visualization and Analytics Center. p.30
Lawrence J. Rosenblum (ed.) (1994). Scientific Visualization: Advances and challenges. Academic Press.
All examples, both images and text, unless another source is given, are from the Lawrence Livermore National Laboratory (LLNL) website. Retrieved 10–11 July 2008.
The data used to make this image were provided by Tom Abel Ph.D. and Matthew Turk of the Kavli Institute for Particle Astrophysics and Cosmology.
Black-hole collisions. The Globus software creators Ian Foster, Carl Kesselman and Steve Tuecke. Publication, Summer 2002.
Image courtesy of Forrest Hoffman and Jamison Daniel of Oak Ridge National Laboratory
Andrew J. Hanson, Tamara Munzner, George Francis: Interactive methods for visualizable geometry, Computer, vol. 27, no. 7, pp. 73–83 (abstract)
A. J. Hanson: Constrained 3D navigation with 2D controller, Visualization '97., Proceedings, 24 October 1997, pp. 175-182 (abstract)
Hui Zhang, Andrew J. Hanson: Shadow-Driven 4D Haptic Visualization, IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 6, pp. 1688-1695 (abstract)
Image by Donna Cox and Robert Patterson. The National Science Foundation Press Release 08-112.

NASA Goddard Scientific Visualization Studio

Further reading

Bruce H. McCormick, Thomas A. DeFanti and Maxine D. Brown (eds.) (1987). Visualization in Scientific Computing. ACM Press.
Gregory M. Nielson, Hans Hagen and Heinrich Müller (1997). Scientific Visualization: Overviews, Methodologies, and Techniques. IEEE Computer Society.
Clifford A. Pickover (ed.) (1994). Frontiers of Scientific Visualization. New York: John Wiley & Sons.
Lawrence J. Rosenblum (ed.) (1994). Scientific Visualization: Advances and challenges. Academic Press.
Will Schroeder, Ken Martin, Bill Lorensen (2003). The Visualization Toolkit. Kitware, Inc.
Leland Wilkinson (2005). The Grammar of Graphics, Springer.
Paolo Ciuccarelli, Giorgia Lupi, Luca Simeone (2014). Visualizing the Data City: Social Media as a Source of Knowledge for Urban Planning and Management. Springer.

External links
Wikimedia Commons has media related to Scientific visualization.

National Institute of Standards and Technology Scientific Visualizations, with an overview of applications.
Scientific Visualization Tutorials, Georgia Tech
NASA Scientific Visualization Studio. They facilitate scientific inquiry and outreach within NASA programs through visualization.
scienceviz.com - Scientific Visualisation, Simulation and CG Animation for Universities, Architects and Engineers

Visualizing Data Mining Models

1. Introduction

The point of data visualization is to let the user understand what is going on. Since data mining usually involves extracting "hidden" information from a database, this understanding process can get somewhat complicated. In most standard database operations nearly everything the user sees is something that they knew existed in the database already. A report showing the breakdown of sales by product and region is straightforward for the user to understand because they intuitively know that this kind of information already exists in the database. If the company sells different products in different regions of the country, there is no problem translating a display of this information into a relevant understanding of the business process.

Data mining, on the other hand, extracts information from a database that the user did not already know about. Useful relationships between variables that are non-intuitive are the jewels that data mining hopes to locate. Since the user does not know beforehand what the data mining process has discovered, it is a much bigger leap to take the output of the system and translate it into an actionable solution to a business problem. Since there are usually many ways to graphically represent a model, the visualizations that are used should be chosen to maximize the value to the viewer. This requires that we understand the viewer's needs and design the visualization with that end-user in mind. If we assume that the viewer is an expert in the subject area but not data modeling, we must translate the model into a more natural representation for them. For this purpose we suggest the use of orienteering principles as a template for our visualizations.

1.1 Orienteering

Orienteering is typically accomplished by two chief approaches: maps and landmarks. Imagine yourself set down in an unknown city with instructions to find a given hotel. The usual method is to obtain a map showing the large-scale areas of the city. Once the "hotel district" is located we will then walk along looking for landmarks such as street names until we arrive at our location. If the landmarks do not match the map, we will re-consult the map and even replace one map with another. If the landmarks do not appear correct then usually one will backtrack, try a short side journey, or ask for further landmarks from people on the street. The degree to which we will follow the landmark chain or trust the map depends upon the match between the landmarks and the map. It will be reinforced by unexpected matches (happening along a unique landmark for which we were not looking), by finding the landmark by two different routes and by noting that variations are small. Additionally, our experience with cities and maps and the urgency of our journey will affect our confidence as well.

The combination of a global coordinate system (the map analogy) and the local coordinate system (the landmarks) must fit together and must instill confidence as the journey is traversed. The concept of a manifold is relevant in that the global coordinates must be realizable, in some sense, as a combination of local coordinate systems. To grow trust in the user we should:

Show that nearby paths (small distances in the model) do not lead to widely different ends.
Show, on demand, the effect that different perspectives (change of variables or inclusion probabilities) have on model structure.
Make dynamic changes in coloring, shading, edge definition and viewpoint (dynamic dithering).
Sprinkle known relationships (landmarks) throughout the model landscape.
Allow interaction that provides more detail and answers queries on demand.

The advantages of this manifold approach include the ability to explore it in some optimal way (such as projection pursuit), the ability to reduce the models to an independent coordinate set, and the ability to measure model adequacy in a more natural manner.

1.2 Why Visualize a Data Mining Model?

The driving forces behind visualizing data mining models can be broken down into two key areas: Understanding and Trust. Understanding is undoubtedly the most fundamental motivation behind visualizing the model. Although the simplest way to deal with a data mining model is to leave the output in the form of a black box, the user will not necessarily gain an understanding of the underlying behavior in which they are interested. If they take the black box model and score a database, they can get a list of customers to target (send them a catalog, increase their credit limit, etc.). There’s not much for the user to do other than sit back and watch the envelopes go out. This can be a very effective approach. Mailing costs can often be reduced by an order of magnitude without significantly reducing the response rate.

The more interesting way to use a data mining model is to get the user to actually understand what is going on so that they can take action directly. Visualizing a model should allow a user to discuss and explain the logic behind the model with colleagues, customers, and other users. Getting buy-in on the logic or rationale is part of building the users’ trust in the results. For example, if the user is responsible for ordering a print advertising campaign, understanding customer demographics is critical. Decisions about where to put advertising dollars are a direct result of understanding data mining models of customer behavior. There’s no automated way to do this. It’s all in the marketing manager’s head. Unless the output of the data mining system can be understood qualitatively, it won’t be of any use. In addition, the model needs to be understood so that the actions that are taken as a result can be justified to others.

Understanding means more than just comprehension; it also involves context. If the user can understand what has been discovered in the context of their business issues, they will trust it and put it into use. There are two parts to this problem: 1) visualization of the data mining output in a meaningful way, and 2) allowing the user to interact with the visualization so that simple questions can be answered. Creative solutions to the first part have recently been incorporated into a number of commercial data mining products (such as MineSet [1]). Graphing lift, response, and (probably most importantly) financial indicators (e.g., profit, cost, ROI) gives the user a sense of context that can quickly ground the results in reality. After that, simple representations of the data mining results allow the user to see them directly. Graphically displaying a decision tree (CART, CHAID, and C4.5) can significantly change the way in which the data mining software is used. Some algorithms (e.g., neural networks) can pose more problems than others, but novel solutions are starting to appear.
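
As a hedged sketch of what a lift/response chart computes (synthetic scores and response flags; numpy and matplotlib assumed): customers are sorted by model score, and the cumulative share of responders captured is plotted against the fraction of the list mailed.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    scores = rng.random(10_000)                    # model score per customer
    responded = rng.random(10_000) < 0.1 * scores  # synthetic response flags

    order = np.argsort(-scores)                    # best-scored customers first
    cum_resp = np.cumsum(responded[order]) / responded.sum()
    frac_mailed = np.arange(1, len(order) + 1) / len(order)

    plt.plot(frac_mailed, cum_resp, label="model")
    plt.plot([0, 1], [0, 1], "--", label="random mailing")
    plt.xlabel("fraction of list mailed")
    plt.ylabel("fraction of responders reached")
    plt.legend()
    plt.show()
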

It is the second part that has yet to be addressed fully. Interaction is, for many users, the Holy Grail of visualization in data mining. Manipulating the data and viewing the results dynamically allows the user to get a feel for the dynamics and test whether something really counter-intuitive is going on. Interactivity helps achieve this, and the easier it is to do the better. Seeing a decision tree is nice, but what users really want is to drag-and-drop the best segments onto a map of the United States in order to see if there are sales regions that are neglected. The number of "what if" questions that can be asked is endless: How do the most likely customers break down by gender? What is the average balance for the predicted defaulters? What are the characteristics of mail order responders? The interaction will continue until the user understands what is going on with their customers. Users also often desire drill-through so that they can see the actual data behind a model (or some piece of the model), although this is probably more a matter of perception than actual usefulness. Finally, integration with other decision support tools (e.g., OLAP) will let users view the data mining results in a manner that they are already using to understand customer behavior. By incorporating interaction into the process, a user will be able to connect the data mining results with his or her customers.

2. Trusting the Model

Attributing the appropriate amount of trust to data mining models is essential to using them wisely. Good quantitative measures of "trust" must ultimately reflect the probability that the model's predictions would match future test targets. However, due to the exploratory and large-scale nature of most data-mining tasks, fully articulating all of the probabilistic factors to do so would seem to be generally intractable. Thus, instead of focusing on trying to boil "trust" down to one probabilistic quantity, it is typically most useful to visualize along many dimensions some of the key factors that contribute to trust (and distrust) in one's models. Furthermore, since, as with any scientific model, one ultimately can only disprove a model, visualizing the limitations of the model is of prime importance. Indeed, one might best view the overall goal of "visualizing trust" as that of understanding the limitations of the model, as opposed to understanding the model itself.

Since data mining relies heavily on training data, it is important to understand the limitations that given data sets put on future application of the resulting model. One class of standard visualization tools involves probability density estimation and clustering over the training data. Especially interesting would be regions of state space that are uncommon in the training data yet do not violate known domain constraints. One would tend to trust a model less if it acts more confident when presented with uncommon data as future inputs. For time-series data, visualizing indicators of non-stationarity is also important.

2.1 Assessing Trust in a Model

Assessing model trustworthiness is typically much more straightforward than the holy grail of model understanding per se, essentially because the former is largely deconstructive while the latter is constructive. For example, without a deep understanding of a given model, one can still use general domain knowledge to detect that it violates expected qualitative principles. A well-known example is that one would be concerned if one's model employed a (presumably spurious) statistical correlation between shoe size and IQ. Of course, there are still very significant challenges in declaring such knowledge as completely and consistently as possible.

Domain knowledge is also critical for the outlier detection needed to clean data and avoid classic problems such as a juvenile crime committed by an 80-year-old "child". If a data mining model were built using the data in Figure 1, it is possible that outliers (most likely caused by incorrect data entry) would skew the resulting model (especially the zero-year-old children, which are more reasonable than eighty-year-old children). The common role of visualization here is mostly in terms of annotating model structures with the domain knowledge that they violate.

Figure 1: Age (in months) vs. Days to Intake Decision for juvenile crime offenders, Maryland Department of Juvenile Services. Note the 80-year-old children on the right side of the graph.


Not all assessments of trust are negative in nature, however. In particular, one can also increase one's trust in a model if other reasonable models seem worse. In this sense, assessing trust is also closely related to model comparison. In particular, it is very useful to understand the sensitivity of model predictions and quality to changes in parameters and/or structure of the given model. There are many ways to visualize such sensitivity, often in terms of local and global (conditional) probability densities, with special interest in determining whether multiple modes of high probability exist for some parameters and combinations. Such relative measures of trust can be considerably less demanding to formulate than attempts at more absolute measures, but they do place special demands on the visualization engine, which must support quick and non-disorienting navigation through neighboring regions in model space.

Statistical summaries of all sorts are also common and useful for gathering insights for assessing model trust. Pairwise scatter-plots and low-dimensional density estimates are especially common. Summaries can be particularly useful for comparing relative trust of two models, by allowing analysis to focus on subsets of features for which their interrelationships differ most significantly between two models.

It is often useful to combine summaries with interactive ability to drill-through to the actual data. Many forms of visual summary actually display multiple scales of data along the raw to abstract continuum, making visual drill-through a natural recursive operation. For example, compressing millions of samples into a time-series strip chart that is only 1000 pixels wide allows one to quickly see the global highest and lowest points across the entire time range, as well as the local high and low points occurring within each horizontal pixel.
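
A minimal numpy/matplotlib sketch of that per-pixel compression, with all sizes chosen purely for illustration:

    import numpy as np
    import matplotlib.pyplot as plt

    samples = np.random.randn(1_000_000).cumsum()  # stand-in for a long series
    pixels = 1000
    buckets = samples[: len(samples) // pixels * pixels].reshape(pixels, -1)

    # Draw each pixel column as a bar from its bucket's minimum to maximum, so
    # global and local extremes survive the thousand-fold compression.
    lo, hi = buckets.min(axis=1), buckets.max(axis=1)
    plt.fill_between(np.arange(pixels), lo, hi)
    plt.show()
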

Most useful are models that qualify their own trustworthiness to some degree, such as in quantifying the expected variance in the error of their predictions.

In practice, such models tend to be relatively rare. Heavy emphasis on expected case rather than worst case performance is generally not all that inappropriate, since one is typically ultimately interested in concepts such as expected cumulative payoff.

There are important classes of tasks, such as novelty detection (e.g. fraud detection), for which quantified variance is essential. Standard techniques are learning confidence intervals (e.g. error bars for neural networks) and general probability density estimation. A promising recent approach [2], called bounds estimation, attempts to find a balance between the complexity of general probability density estimation and the simplicity of the mean estimation plus variance estimation approach to error bars.

Finally, it is important, though rather rare in practice to date, to consider many transformations of the data during visual exploration of model sensitivities. For example, a model that robustly predicts the internal pressure of some engineering device should probably also predict related quantities well, such as its derivative, its power spectrum, and other relevant quantities (such as nearby or redundant pressures). Checking for such internal consistency is perhaps ultimately one of the most important ways to judge the trustworthiness of a model, beyond standard cross-validation error. Automated and interactive means of exploring and visualizing the space (and degrees) of inconsistencies a model entails seem to be a particularly important direction for future research on assessing model trustworthiness.

3. Understanding the Model

A model that can be understood is a model that can be trusted. While statistical methods build some trust in a model by assessing its accuracy, they cannot assess the model’s semantic validity — its applicability to the real world.

A data mining algorithm that uses a human-understandable model can be checked easily by domain experts, providing much needed semantic validity to the model. Unfortunately, users are often forced to trade off accuracy of a model for understandability.

Advanced visualization techniques can greatly expand the range of models that can be understood by domain experts, thereby easing the accuracy/understandability trade-off. Three components are essential for understanding a model: representation, interaction, and integration. Representation refers to the visual form in which the model appears. A good representation displays the model in terms of visual components that are already familiar to the user. Interaction refers to the ability to see the model in action in real time, to let the user play with the model as if it were a machine. Integration refers to the ability to display relationships between the model and alternate views of the data on which it is based. Integration provides the user context.

The rest of this section will focus on understanding classification models. Specifically, we will examine three models built using Silicon Graphics' MineSet: decision tree, simple Bayesian, and decision table classifiers [3]. Each of these tools provides a unique form of understanding based on representation, interaction, and integration.

The graphical representation should be simple enough to be easily understood, but complete enough to reveal all the information present in the model. This is a difficult balance since simplicity usually trades off against completeness. Three-dimensional visualizations have the potential to show far more information than two-dimensional visualizations while retaining their simplicity. Navigation in such a scene lets one focus on an element of interest while keeping the rest of the structure in context. It is critical, however, that the user be able to navigate through a three-dimensional visualization in real time. An image of a three-dimensional scene is merely a two-dimensional projection and is usually more difficult to understand than a scene built in two dimensions.

Even with three dimensions, many models still contain far too much information to display simply. In these cases the visualization must simplify the representation as it is displayed. The MineSet decision tree and decision table visualizers use the principle of hierarchical simplification to present a large amount of information to the user.

Decision trees are easy to understand but can become overwhelmingly large when automatically induced. The SGI MineSet Tree Visualizer uses a detail-hiding approach to simplify the visualization. In figure 2, only the first few levels of the tree are initially displayed, despite the fact that the tree is extensive. The user can gain a basic understanding of the tree by following the branches of these levels. Additional levels of detail are revealed only when the user navigates to a deeper level, providing more information only as needed.



Figure 2: The MineSet Tree Visualizer shows only the portion of the model close to the viewer.


Using decision tables as a model representation generates a simple but large model. A full decision table theoretically contains the entire dataset, which may be very large. Therefore simplification is essential. The MineSet decision table arranges the model into levels based on the importance of each feature in the table. The data is automatically aggregated to provide a summary using only the most important features. When the user desires more information, he can drill down as many levels as needed to answer his question. The visualization automatically changes the aggregation of the data to display the desired level of detail. In figure 3, a decision table shows the well-known correlation between head shape and body shape in the monk dataset. It also shows that the classification is ambiguous in cases where head shape does not equal body shape. For these cases, the user can drill down to see that the attribute jacket color determines the class.



Figure 3: The MineSet Decision Table Visualizer shows additional pairs of attributes as the user drills down into the model.


While a good representation can greatly aid the user’s understanding, in many cases the model contains too much information to provide a representation that is both complete and understandable. In these cases we exploit the brain’s ability to reason about cause and effect and let the user interact with the more complex model. Interaction can be thought of as "understanding by doing" as opposed to "understanding by seeing".

Common forms of interaction are interactive classification, interactive model building, drill-up, drill-down, animation, searching, filtering, and level-of-detail manipulation. The fundamental techniques of searching, filtering, drill-up, and drill-down make the task of finding information hidden within a complex model easier. However, they do not help overall understanding much. More extensive techniques (interactive classification, interactive model building) are required to help the user understand a model which is too complicated to show with a static image or table. These advanced methods aid understanding by visually showing the answer to a user query while maintaining a simplified representation of the model for context.

The MineSet Evidence Visualizer allows the user to interact with a simple Bayesian classifier (Figure 4). Even simple Bayesian models are based on multiplying arrays of probabilities that are difficult to understand by themselves. However, by allowing the user to select values for features and see the effects, the visualization provides cause-and-effect insight into the operation of the classifier. The user can play with the model to understand exactly how much each feature affects the classification and ultimately decide to accept or reject the result. In the example in the figure, the user selects the value of "working class" to be "self-employed-incorporated," and the value of "education" to be "professional-school". The pie chart on the right displays the expected distribution of incomes for people with these characteristics.
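
The arithmetic such a display summarizes is simple (naive) Bayes evidence combination. In the Python sketch below, the priors and conditional probabilities are invented for illustration; MineSet's actual values would come from the training data.

    # P(class) and P(attribute value | class); all numbers are invented.
    priors = {">50K": 0.25, "<=50K": 0.75}
    likelihood = {
        (">50K",  "workclass=self-emp-inc"): 0.10,
        ("<=50K", "workclass=self-emp-inc"): 0.02,
        (">50K",  "education=prof-school"):  0.08,
        ("<=50K", "education=prof-school"):  0.01,
    }
    selected = ["workclass=self-emp-inc", "education=prof-school"]

    # Simple Bayes: P(class | values) is proportional to
    # P(class) * product of P(value | class) over the selected values.
    scores = dict(priors)
    for c in scores:
        for v in selected:
            scores[c] *= likelihood[(c, v)]
    total = sum(scores.values())
    posterior = {c: s / total for c, s in scores.items()}  # the pie chart slices
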



Figure 4: Specific attribute values are selected in the Evidence Visualizer in order to predict income for people with those characteristics.


Beyond interactive classification, interactively guiding the model-building process provides additional control and understanding to the user. Angoss [4] provides a decision tree tool that gives the user full control over when and how the tree is built. The user may suggest splits, perform pruning, or manually construct sections of the tree. This facility can boost understanding greatly. Figure 5a shows a decision tree split on a car's brand attribute. While the default behavior of the tree is to form a separate branch for each categorical value, a better approach is often to group similar values together and produce only a few branches. The result shown in figure 5b is easier to understand and can sometimes give better accuracy. Interactive models allow the user to make changes like this as the situation warrants.





Figures 5a and 5b: A decision tree having branches for every value of the brand attribute (top), and a decision tree which groups values of brand to produce a simpler structure (bottom).


Interactive techniques and simplified representations can produce models that can be understood within their own context. However, for a user to truly understand a model, he must understand how the model relates to the data from which it was derived. For this goal, tool integration is essential.

Few tools on the market today use integration techniques. The techniques that are used come in three forms: drill-through, brushing, and coordinated visualizations. Drill-through refers to the ability to select a piece of a model and gain access to the original data from which that piece of the model was derived. For example, the decision tree visualizer allows selection and drill-through on individual branches of the tree. This provides access to the original data that was used to construct those branches, leaving out the data represented by other parts of the tree. Brushing refers to the ability to select pieces of a model and have the selections appear in an alternate representation. Coordinated visualizations generalize both techniques by showing multiple representations of the same model, combined with representations of the original data. Interactive actions that affect the model also affect the other visualizations. All three of these techniques help the user understand how the model relates to the original data. This provides an external context for the model and helps establish semantic validity.

As data mining becomes more extensive in industry and as the number of automated techniques employed increases, there is a natural tendency for models to become increasingly complex. In order to prevent these models from becoming mysterious oracles, whose dictates must be accepted on faith, it is essential to develop more sophisticated visualization techniques to keep pace with the increasing model complexity. Otherwise there is a danger that we will make decisions without understanding the reasoning behind them.

4. Comparing Different Models using Visualization

Model comparison requires the creation of an appropriate metric for the space of models under consideration. To visualize the model comparison, these metrics must be interpretable by a human observer through his or her visual system. The first step is to create a mapping from input to output of the modeling process. The second step is to map this process to the human visual space.

4.1 Different Meanings of the Word "Model"

It is important to recognize that the word "model" can have several levels of meaning. Common usage often associates the word model with the data modeling process. For example, we might talk of applying a neural network model to a particular problem. In this case, the word model refers to the generic type of model known as a neural network. Another use of the word model is associated with the end result of the modeling process. In the neural network example, the model could be the specific set of weights, topology, and node types that produces an output given a set of inputs. In still another use, the word model refers to the input-output mapping associated with a "black-box." Such a mapping necessarily places emphasis on careful identification of the input and output spaces.

4.2 Comparing Models as Input-Output Mappings

The input-output approach to model comparison simply considers the mapping from a defined input space to a defined output space. For example, we might consider a specific 1-gigabyte database with twenty-five variables (columns). The input space is simply the Cartesian product of the database's twenty-five variables. Any actions inside the model, such as creation of new variables, are hidden in the "black-box" and are not interpreted. At the end of the modeling process, an output is generated. This output could be a number, a prioritized list or even a set of rules about the system. The crucial issue is that we can define the output space in some consistent manner to derive an input to output mapping.

It is the space generated by the mappings that is of primary importance to the model comparison. For most applications the mapping space will be well defined once the input and output spaces are well defined. For example, two classifiers could be described by a set of input/output pairs, such as (obs1, class a), (obs2, class b), etc. The comparison metric could then be defined on these pairs as a count of the number of differing pairs, or GINI indices, or classification cost, etc. The resulting set of pairs could be visualized by simply plotting points on a two-dimensional graph. The two models could be indexed by coloring or symbol codes, or one could focus on the difference between the models directly and plot that. This approach should prove adequate so long as we restrict attention to a well-defined input-output structure.
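
A minimal sketch of such a count-based metric (plain numpy; the example labels are invented), treating both models strictly as black boxes that map the same observations to class labels:

    import numpy as np

    def disagreement(preds_a, preds_b):
        """Fraction of observations on which two models' outputs differ."""
        preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
        return float(np.mean(preds_a != preds_b))

    # Predictions from two classifiers over the same five observations.
    a = np.array(["a", "b", "b", "a", "c"])
    b = np.array(["a", "b", "c", "a", "a"])
    print(disagreement(a, b))   # 0.4 -> the models differ on 40% of inputs
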

4.3 Comparing Models as Algorithms

In the view of a model as a static algorithm, there again seems to be a reasonable way to approach the model comparison problem. For example, a neural network model and an adaptive nonlinear regression model might be compared. These models would be expressed as a series of algorithmic steps. Each model's algorithm could then be analyzed by standard methods for measuring algorithmic performance, such as complexity, finite word length, and the stability of the algorithm. The investigator could also include measures on the physical implementation of the algorithm, such as computation time or computation size. Using these metrics, the visualization could take the form of bar charts across the metrics. Again, different models could be encoded by color or symbol, and a graph of only the differences between the two models on each metric could be provided. Each comparison would be a static snapshot, but dynamic behavior could certainly be exploited through a series of snapshots, i.e. a motion picture.

4.4 Comparing Models as Processes

The view of the model as a process is the most ill-defined, and therefore the most intractable, of the three views, but this should not minimize its importance. Indeed, its sheer complexity might make it the most important view for the application of visualization. It is precisely in this arena that we encounter the subject area expert, for whom these systems should offer the most benefit (such as confidence and trust).

The modeling process includes everything in and around the modeling activity, such as the methods, the users, the database, the support resources, and constraints such as knowledge, time and analysis implementation. Clearly this scope is too large for us to consider. Let us narrow it by assuming that the model comparison is being applied for one user on one database over a short time period. This implies that user differences, database differences, and knowledge differences can be neglected. We are left with analysis methods and implementation issues. For most subject area experts the implementation and the analysis are not separable, so we will make the additional assumption that this issue can be ignored as well. With these simplifying assumptions we are essentially defining model comparison to be the comparison of modeling method and implementation simultaneously.

Imagine two models that are available in some concrete implemented form. These could be different general methods, such as neural networks versus tree-based classifiers, or they could be different levels of sophistication within a class of models, such as CART versus CHAID tree structures. Remember that we are now focusing only on the modeling process, and not its input/output or algorithmic structure. It seems that reasonable metrics can be defined in this situation. For example, the running time could be such a metric, or the interpretability of instructions, or the number of tuning parameters that must be chosen by the user at run-time. The key here is that these metrics must be tailored to the user who is the target of the application. Thus, whereas the input-output view focused on the spaces themselves, and the algorithmic view focused on the properties of the algorithm independently of the user, now we must focus in great detail on the user's needs and perceptions.

Once a set of metrics are chosen, we appear to be in a similar situation to that described under the algorithmic comparison. We should be able to show the distances between models in each of the defined metrics in a bar chart or other standard display. Color or symbol coding can be used to show the results from each model on the same chart as well.

There will be many possible metrics for the model-building process, at least one per user. Since it is unlikely we can choose a set of "one-size-fits-all" metrics, it is more useful to establish properties of good metrics and create methods to establish them in novel situations. The metrics chosen by an academic researcher would likely be very different from those chosen by a business user. Some properties that good metrics for the modeling process should have are:

That they are expressed in terms of direct risk/benefit to the user.
That they evaluate their sensitivity to model inputs and assumptions.
That they can be audited (open to questioning at any point).
That they are dynamic.
That they can be summarized in the sense of an overall map.
That they allow reference to landmarks and markers.

Some aspects of the visualization process will take on added importance. One such aspect is the sequential behavior of the modeling process. For example, it is common to plot, at frequent intervals, the updated fit between the data and the model predictions as a neural network learns. A human being will probably give more trust to a method which mimics his or her own learning behavior (i.e., a learning curve which starts with a few isolated details, then grows quickly to broad generalizations and then makes only incremental gains after that, in the typical "S" shape). Unstable behavior or large swings should count against the modeling process.

Another aspect of importance is a visual track of the sensitivity of the modeling process to small changes in the data and in the modeling process parameters. For example, one might make several random starts with different random weights in a neural network model. These should be plotted against one another showing their convergence patterns, again perhaps against a theoretical S-shaped convergence.
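
A hedged sketch of such a display, with entirely synthetic curves standing in for real training runs: several random restarts are overlaid on an idealized S-shaped reference curve.

    import numpy as np
    import matplotlib.pyplot as plt

    epochs = np.arange(100)
    ideal = 1.0 / (1.0 + np.exp(-(epochs - 40) / 8))   # logistic "S" reference

    rng = np.random.default_rng(1)
    for restart in range(5):                           # five random restarts
        shift = rng.integers(-10, 10)
        noise = 0.05 * rng.normal(0, 0.5, size=epochs.shape).cumsum()
        run = 1.0 / (1.0 + np.exp(-(epochs - 40 - shift) / 8)) + noise
        plt.plot(epochs, run, alpha=0.6)

    plt.plot(epochs, ideal, "k--", linewidth=2, label="idealized S-curve")
    plt.xlabel("training step")
    plt.ylabel("fit between data and model")
    plt.legend()
    plt.show()
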

The model must also be auditable, meaning that inquiries may be made at any reasonable place in the modeling process. For a neural network we should be able to interrupt it and examine individual weights at any step in the modeling process. Likewise, for a tree-based model we should be able to see subtrees at will. Ideally there would be several scales at which this interruption could occur.

Since most humans operate on a system of local and global coordinates, it will be important to be able to supplement the visualizations with markers and a general map structure. For example, even though the direct comparison is between two neural nets with different structures, it would be good to have the same distances plotted for another method with which the user is familiar (like discriminant analysis), even if that method is inadequate. If the same model could be used on a known input, the user could establish trust with the new results. It might also be useful to display a detailed and a summarized model simultaneously. For example, the full tree-based classifier might have twenty-five branches, but the summarized tree might show only the broad limbs. And if the output is a rule, it might be useful to derive (through logical manipulation) other results or statements of results as a test of reasonableness.

5. Conclusion

In this paper we have discussed a number of methods to visualize data mining models. Because data mining models typically generate results that were previously unknown to the user, it is important that any model visualization provide the user with sufficient levels of understanding and trust.


References

[1] C. Brunk, J. Kelly, and R. Kohavi, "MineSet: An Integrated System for Data Access, Visual Data Mining, and Analytical Data Mining," Proceedings of the Third Conference on Knowledge Discovery and Data Mining (KDD-97), Newport Beach, CA, August 1997. See also http://www.sgi.com/Products/software/MineSet

[2] D. DeCoste, "Mining multivariate time-series sensor data to discover behavior envelopes," Proceedings of the Third Conference on Knowledge Discovery and Data Mining (KDD-97), Newport Beach, CA, August 1997.

[3] D. Rathjens, MineSet Users Guide, Silicon Graphics, Inc., 1997.

[4] See http://www.angoss.com.

Thursday, January 28, 2016

Data visualisation

Data visualisation is an integral part of data analysis and business intelligence. Explore the most recommended types of charts and good design tips to help you create powerful and persuasive graphs for decision making.
Data visualisation is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication.
Few (2013)

Most modern organisations use numerical data to communicate quantitative information. These numbers are fundamental to understanding organisational performance. This information can be presented in many different ways, for example graphs, maps and, at a more advanced level, dashboards.


Despite the popular wisdom, data and numbers cannot always speak for themselves. Too much time can be spent struggling to understand data presented in lengthy reports and numerical tables; this time could be better spent making evidence-based decisions.

Data visualisation can help with the analysis of that information and present it in a way that allows viewers to discover patterns that might otherwise be hard to uncover. Large amounts of data are hard to wade through, but data visualisation can make that data easily digestible.
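
As a minimal illustration of that point (a sketch assuming matplotlib and an invented daily-sales series), the same 365 numbers that would be impenetrable in a table become digestible as a single line chart:

```python
# Sketch: a year of daily figures, unreadable as a 365-row table,
# summarised by one chart (matplotlib; the series is synthetic).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
days = np.arange(365)
sales = 100 + 0.1 * days + 10 * np.sin(days / 58) + rng.normal(0, 3, 365)

plt.plot(days, sales)
plt.xlabel("day of year")
plt.ylabel("daily sales")
plt.title("A year of daily sales, digestible at a glance")
plt.show()
```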

Who is this resource for?
This guide will benefit anyone interested in creating well designed, informative and easy-to-understand charts. Whether you are a student, researcher or lecturer, or work in management, it is likely that you will often need to include statistical information or analysis in your papers, reports and presentations.

Even if you are already well experienced and have progressed to building interactive web-based dashboards, you may still find it beneficial to refresh your understanding of current practice in visualisation design. After all, even the most advanced dashboards are built from a collection of individual graphs, maps and other visual displays, such as traffic lights and speed dials.

During the past two decades we have seen amazing progress in the technologies that enable us to collect and process huge amounts of data. This vast availability of data has driven interest in data analysis and visualisation, which in turn has led to visualisation methods being constantly updated and developed as new evidence about their effectiveness emerges.

This guide is not intended to be exhaustive. There are too many good sources of information – specialist books, blogs and publications dealing with data visualisation – for us to recreate it all, and even among the experts opinions vary on what the gold standard and best practice should be. Instead, our guide is a distillation of these opinions and advice – much of it tried and tested by us in practice – bringing many useful resources together in one place.

The advice contained in this resource is applicable to data visualisation used in the business context, rather than the data art so commonly seen in the media and conference presentations. In the context of this resource, data art is visualisation of data that seeks primarily to entertain or produce an aesthetic experience.

Business intelligence guide
Organisations require access to accurate, timely and meaningful information about their core businesses and the environment in which they operate, if they are to adapt and thrive during times of great uncertainty.

Our guide on business intelligence helps to explore this essential element of decision-making based on accurate data about the state of your organisation and the environment in which it operates.

Benefits of data visualization
1. Data visualization is a complex set of processes, an umbrella that covers both information visualization and scientific visualization. Its benefits are hard to ignore: quantities are rendered accurately and are easily comparable, and it offers valuable guidance on the use of its techniques and tools. Scientifically, its effectiveness lies in our brain's ability to maintain a proper balance between perception and cognition through visualization.
2. With the sudden rise of thousands of companies and their products, the responsibilities of data visualization as an essential component of business intelligence are ever increasing. This is why companies are hiring expert designers with visualization skills. The significant messages in data are presented in its patterns and trends, gaps and outliers.
3. This is the most engaging part: visualization draws us in and lets us grasp a message far more quickly than raw numbers alone. It is so powerful and effective that it can change someone's mind in a flash. One of its most important benefits is that it encompasses varied data sets quickly, effectively and efficiently and makes them accessible to interested viewers, motivating deep insight with quick access.
4. It gives us the opportunity to approach huge amounts of data and makes them easily comprehensible, be it in entertainment, current affairs, finance or politics. It also builds deep insight, prompting a good decision and, if needed, immediate action – whether the subject is child education, people suffering from health issues, market research for a product, rainfall in a specific geographical area, or many others.
5. Another scope of data visualization is geo-spatial visualization, which has lately emerged in the business world.
6. Geo-spatial visualization has become popular because many websites providing web services want to attract visitors' interest. Such businesses can take advantage of location-specific information already present in the system in the form of customers' zip codes, providing a better daily analysis experience (see the sketch after this list). This type of visualization adds a new dimension to the figures and helps in better understanding of the matter.
7. The leading benefit of data visualization is that it not only provides a graphical representation of data but also allows changing the form, omitting what is not required, and browsing deeper for further details. It is a great eye-catcher, holds our attention, and communicates better than traditional methods. Visual analytics can bring great benefit to businesses.
8. With its help, data can be viewed effectively in multiple ways by dividing the findings, and the patterns it reveals give the data additional meaning. Data visualization provides a balance between visual appeal and practicality: it makes the presented information more efficient, speeds understanding, and reduces confusion and doubt.
9. In conclusion, we can reap the full benefit of data visualization only if we pay it the required attention. Too many colors can create visual noise that prevents proper reading; the main limitation to taking advantage of data visualization is unskilled eyes.
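
As a small illustration of point 6 (a sketch using a hypothetical orders table, not data from any real system), the zip codes already captured with each order are enough to rank locations by daily volume:

```python
# Sketch: location-specific analysis from zip codes already in the
# system (assumes pandas; the orders table is invented).
import pandas as pd

orders = pd.DataFrame({
    "zip": ["20001", "20001", "10027", "94110", "94110", "94110"],
    "date": pd.to_datetime(["2016-01-25"] * 3 + ["2016-01-26"] * 3),
})

# Daily order volume per location, ranked.
daily_by_zip = orders.groupby(["zip", "date"]).size()
print(daily_by_zip.sort_values(ascending=False))
```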


A slick chart, an interactive data-exploration interface or a KPI-based dashboard: all of these are data visualization products. They garner a lot of attention because they are finished products, and look nice as well. However, for many companies engaged in data visualization, those final deliverables aren't the most important benefit. Instead, it's the insights into the quality of their collected data that truly lead to success.

Data visualization provides 3 key insights into data:

Is the data complete?
Is the data valid?
Is the data well-organized?
Without knowing those 3 elements, data collection and business intelligence processes become much more expensive and labor intensive, and may end up abandoned when the data doesn't demonstrate what was intended. Using the insights from data visualization, these projects have a much higher likelihood of completion and success.

Insight into Data #1: Is the data complete?

The most straightforward insight that visualization can give you about your data is its completeness. With a few quick charts, areas where data is missing show up as gaps or blanks on the report (called the “Swiss Cheese” effect).

In addition to learning which specific data elements are missing, visualizations can show trends of missing data. Those trends can tell a story about the data collection process and provide insight into changes necessary in the way data is gathered.

A Data Completeness Example: After creating a visualization of a collection of survey data on movie-going habits, it's clear that there are a significant number of blanks after question 14 of the survey. The visualization helps the survey company recognize that those specific records need to be abandoned, but also that the survey should be shortened to counter "respondent fatigue", the likely cause of the incomplete responses.
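
A minimal sketch of such a completeness check, assuming pandas and matplotlib and simulating the hypothetical survey above: the share of blank answers per question makes the jump after question 14 immediately visible.

```python
# Sketch: the "Swiss Cheese" check — fraction of missing answers per
# survey question (pandas + matplotlib; the survey data is simulated).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_respondents, n_questions = 200, 20
answers = rng.integers(1, 6, size=(n_respondents, n_questions)).astype(float)
# Simulate respondent fatigue: many respondents stop after question 14.
for row in answers:
    if rng.random() < 0.4:
        row[14:] = np.nan

survey = pd.DataFrame(answers,
                      columns=[f"Q{i + 1}" for i in range(n_questions)])
survey.isna().mean().plot(kind="bar")     # fraction missing per question
plt.ylabel("share of blank answers")
plt.title("Missing responses by survey question")
plt.tight_layout()
plt.show()
```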

Insight into Data #2: Is the data valid?

The importance of visualization among data validation techniques has been discussed before. It’s clear, then, that visualization can play a pivotal role in understanding data’s validity. By executing a quick, preliminary visualization on collected data, trends that indicate problems in the complete data can be found.

A Data Validation Example: A collected dataset is designed to demonstrate the difference in male population statistics between Alaska and Florida. Examination of individual records and outliers suggests the data is valid – there is a significantly higher percentage of males in Alaska than in Florida, which is expected. However, a visualization of the entire dataset shows more males in Alaska than in Florida in absolute terms. This is a red flag: even with the gender-ratio differences, Florida's much larger population means it should have a higher total number of males.
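
A sketch of that aggregate check with invented records (the figures are illustrative only): each record looks plausible on its own, but a one-line bar chart of male counts by state exposes the contradiction.

```python
# Sketch: record-level data looks fine; the aggregate raises the flag
# (pandas + matplotlib; the records are invented, Florida under-sampled).
import pandas as pd
import matplotlib.pyplot as plt

records = pd.DataFrame({
    "state": ["Alaska"] * 1000 + ["Florida"] * 800,
    "sex": ["M"] * 530 + ["F"] * 470      # Alaska: 53% male (plausible)
         + ["M"] * 390 + ["F"] * 410,     # Florida: 49% male (plausible)
})

totals = records[records.sex == "M"].groupby("state").size()
totals.plot(kind="bar")                   # Alaska bar exceeds Florida's
plt.ylabel("male records collected")
plt.title("Aggregate check: Florida should dominate but does not")
plt.show()
```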

A well-designed, preliminary visualization can give insight into the validity of collected data that is difficult, or even impossible, to gain with traditional methods.

Insight into Data #3: Is the data well-organized?

Poorly organized data can be the bane of the final step of a data collection or business intelligence process. Using data organization tools from the start can help streamline later steps of the process.

During collection, the data is often organized in a way that optimizes the gathering process. However, that same organizational scheme can be a problem when the time comes to act. The data visualization process serves to highlight the organizational challenges of your data and provides insights into how it might be done better.

A Data Organization Example: A client wishes to use their collected customer data to develop a customer profile that defines demographic breakouts of snack-food purchases indexed by time of day. Their data visualization partner asks them where that data is stored and it is discovered that the transactional data is stored separately from the customer profile information, and that data can only be intersected through yet another correlational dataset. While all the data is technically available, the data needs to be reorganized to be functional in decision making.
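
A minimal sketch of that reorganisation problem, with hypothetical table and column names: the transactional data and the customer profiles can only be intersected through the third, correlational table, so the analysis-ready view has to be assembled with two joins.

```python
# Sketch: assembling an analysis-ready view from three separately
# stored tables (pandas; all table and column names are invented).
import pandas as pd

transactions = pd.DataFrame({
    "txn_id": [1, 2, 3],
    "account_id": ["A1", "A2", "A1"],
    "item": ["chips", "cookies", "soda"],
    "hour": [14, 20, 9],
})
link = pd.DataFrame({                     # the correlational dataset
    "account_id": ["A1", "A2"],
    "customer_id": [101, 102],
})
profiles = pd.DataFrame({
    "customer_id": [101, 102],
    "age_band": ["25-34", "45-54"],
})

# Two joins to build the view the client actually needs:
# demographic breakouts of purchases indexed by time of day.
view = transactions.merge(link, on="account_id") \
                   .merge(profiles, on="customer_id")
print(view.groupby(["age_band", "hour"]).item.count())
```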

Data visualization isn’t just data organization and analysis tool; it can play a crucial role in the entire data gathering and management process. With a well-executed visualization, taking time to understand what is to be learned from the data and how the information will be gathered, companies are able to cut costs and eliminate the waste that comes from having to re-gather or re-organize their data.

To find out what your data has to say to you, contact Boost Labs to learn about creating a visualization to give you the insights your project needs to succeed.

Wednesday, January 27, 2016

Functions of the Bureaucracy

America's bureaucracy performs three primary functions to help keep the governmental beehive buzzing along.

1. The bureaucracy implements the laws and policies made by elected officials.

These laws and policies need to be put into practice in specific situations and applied in all the contingencies of daily life. For example, a city council has decided that all dog owners must have their pets licensed and microchipped, but the city council members don't have the time to make sure that their decision is carried out. City workers, members of the city's bureaucracy, are the ones who answer questions and complaints about the law, help dog owners fill out the proper forms, decide when to waive the license fee, refer owners to veterinarians who can insert the microchips, work with the vets to hand out coupons for discounts on microchips, and enforce the law to make sure that all dog owners have their animals licensed and microchipped in a reasonable amount of time.

2. The bureaucracy provides necessary administrative functions, like conducting examinations, issuing permits and licenses, and collecting fees.

Essentially, it handles the paperwork of everyday government operations. Anyone who has a driver's license has come face-to-face with bureaucratic administration through the required written and behind-the-wheel exams, learning permits, fees at all stages, and finally applying for and receiving the driver's license itself.

3. The bureaucracy regulates various government activities.

In other words, it creates the rules and regulations that clarify how various laws work on a daily basis. For instance, the bureaucracy is responsible for writing rules and regulations for public schools, including curriculum standards, examination procedures, discipline methods, teacher training and licensing requirements, and administrative policies. Schoolchildren feel the effects of these regulations when they work on their assignments or take standardized tests. Teachers use them to design class work and assessments. Principals and school boards must follow them when applying for funding or setting policies for their own schools and districts.
The Face of Bureaucracy

The bureaucracy can seem harsh and faceless to many Americans, who often get fed up with its strict rules and time-consuming procedures, but in fact, most bureaucrats, people who work in the bureaucracy, are simply their neighbors and fellow citizens. Who are all these busy bee bureaucrats who implement, administer, and regulate citizens' interaction with the government? A few interesting facts will introduce us to them.

Channels To Market

Entering International Markets

Modern companies need to plan for growth and survival in the globalized world of business competition. Some will choose to conduct business from home, taking on competitors in the safety of their domestic market. Other companies will decide to go international, operating from both domestic and foreign markets.

In order to do this, the latter will have to make use of entry strategies to sustain their presence in foreign markets. An entry strategy is a process of deciding the direction of a company's international business by combining reason with empirical knowledge.

The entry-strategy time frame averages three to five years, yet once operations in a foreign market begin, the manager may decide to revise entry-strategy decisions.

A foreign market entry strategy is an international business plan; its aim is to lay down:

• Objectives
• Resources
• Policies

These will guide a company’s international business long enough to achieve sustainable growth in world markets.

Managers need to tailor an entry strategy for each product or foreign market. Cultural differences in foreign markets mean that a single entry strategy would not necessarily function in another setting. Different products and different markets will inevitably draw different responses.

For a company dealing with many foreign markets, the logistics of devising a different entry strategy for each is a daunting task. Instead, clusters of similar country markets can be identified and entered with a similar strategy.

Commonly used entry strategies

Most managers enter foreign markets through exporting. By using an indirect channel (e.g. an export management company), a manufacturer can begin exporting with low start-up costs and test the market's response to his product in a low-risk, experimental fashion. Yet indirect exporting brings an inevitable lack of control over foreign sales. After initial exposure to the foreign market, more ambitious managers will prefer a direct export channel.

1. Direct export channels

Agents

Definition: An individual or legal entity authorized to act on behalf of another individual or legal entity (the principal). An agent’s authorized actions will bind the principal. A sales representative, for example, is an agent of the seller.

A foreign agent is an independent middleman representing the manufacturer in the foreign market. He sells the product on a commission basis. The manufacturer will receive orders from his agent and ship directly to his foreign buyer.

Agent channels have lower start up costs and are commonly used for early export entry.

Distributors

Definition: An agent who sells directly for a supplier and maintains an inventory of the supplier’s products.

A foreign distributor buys the manufacturer's product for resale to middlemen or final buyers. He performs more functions than an agent (maintaining inventories, providing after-sales services) and assumes the ownership risk. He obtains a profit margin on resale of the product.

An agent or a distributor?

• Determine their profit contribution by estimating their respective sales and costs (a worked sketch follows this list)
• Also consider control, risk and other channel specifications
• Monitor channel performance to know if you need to change your channel arrangements
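
A worked sketch of the profit-contribution comparison referred to above, using invented figures only: estimate each channel's sales revenue and costs, then compare the resulting contributions.

```python
# Worked sketch with invented figures: profit contribution =
# estimated sales revenue minus channel costs.
agent_units, agent_net_price, agent_costs = 8_000, 10.0, 12_000.0
dist_units, dist_net_price, dist_costs = 12_000, 8.5, 9_000.0

agent_contribution = agent_units * agent_net_price - agent_costs   # 68,000
dist_contribution = dist_units * dist_net_price - dist_costs       # 93,000

print(f"agent:       {agent_contribution:,.0f}")
print(f"distributor: {dist_contribution:,.0f}")
```

Here the distributor moves more volume at a lower net price and still contributes more profit, but control, risk and the other channel specifications may tip the decision the other way.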

Choosing a foreign agent/ distributor

• Draw up an agent/distributor profile listing all the qualities desired for that particular foreign market
• After desk research, personally interview the best prospects
• Draw up a written contract with your foreign representative, putting special emphasis on exclusive rights, dispute resolution and contract termination
• Remember that 'your line is only as good as your foreign representative' – invest the necessary time and effort in finding the appropriate candidate and in building an efficient 'export channel team'

Subsidiary/ Branch channels

Subsidiary
Definition: any organisation more than 50 per cent of whose voting stock is owned by another firm. A wholly owned subsidiary is one in which the parent owns 100 per cent of the stock; the parent can either set up a new operation in that country (known as a green-field venture) or acquire an established firm in the host nation and use it to promote its products.

Branch

Definition: a local office, shop or group that is part of a larger organisation with a main office elsewhere.

This channel requires the manufacturer to establish its own sales operation in the target country. It provides more control over the foreign marketing plan than the agent/distributor option, yet it carries higher fixed costs.

2. Joint Ventures

Definition:

1. A combination of two or more individuals or legal entities who undertake together a transaction for mutual gain or to engage in a commercial enterprise together with mutual sharing of profits and losses.
2. A form of business partnership involving joint management and the sharing of risks and profits as between enterprises based in different countries. If joint ownership of capital is involved, the partnership is known as an equity joint venture.

Advantages:

• Combined resources can exploit a target market more effectively
• May be the only investment opportunity if host governments prohibit sole ventures (common in developing or communist countries)
• The local partner reduces the foreign partner's exposure to investment risk
• Attractive to foreign companies with little experience in foreign ventures
• The local partner contributes his knowledge of local customs, business environment and important contacts e.g. with customers and suppliers

Disadvantages:

• Can lead to a loss of control over foreign operations
• The interests of local partners must be accommodated
• Can be obstacles to the creation of global marketing and production systems

Choosing the right partner

• Determine what you want the joint venture to accomplish
• Ask how the venture fits with your overall international business strategy
• Find out the objectives of your local partner and the resources he could bring

Tips for a successful joint venture

• Negotiate ownership shares, dividend policy, management issues, dispute settlement etc.
• Build a strong business relationship supporting the common venture
• You may wish to learn the language/ cultural habits of your foreign partner to strengthen your business relationship

3. Contractual entry modes

Licensing

Definition: The transfer of industrial property rights, patents, trademarks or proprietary know-how from a licensor in one country to a licensee in a second country.

For manufacturers wanting an aggressive foreign-market entry strategy, licensing is not the best option; they consider it only when export or investment entry is not feasible. Small manufacturers are more prone to using licensing, since it offers a low-commitment entry mode. Licensing can be combined with other entry modes, and its most popular form is the licensing/equity mix, which allows the manufacturer to benefit from the growth of the licensee firm.

Advantages:

• The circumvention of import restrictions and transportation costs, since only intangible property rights and technology are transferred
• It requires no fixed investments by the manufacturer
• Licensing arrangements are exposed to less political risks than foreign investments

Disadvantages:

• No quality control is maintained over the licensed product
• No control over the licensee’s volume of production or marketing strategy
• The licensed product’s market performance depends on the licensee
• Returns are smaller than from export or investment entry, e.g. royalty rates rarely exceed 5 per cent of the licensee's net sales

Opportunity costs:

• The creation of a competitor in third markets or a manufacturer’s home market
• The exclusivity of a licensing agreement prevents the licensor from marketing the product in the licensee's country even if the licensee is failing to exploit the market opportunity

Franchising

Definition: to authorize others to use a company’s name and sell its goods.

Franchising differs from conventional licensing in that the franchisor licenses a business system to an independent franchisee in a foreign country. The franchisee carries on the business under the franchisor's trade name and in accordance with the franchise agreement, reproducing the franchisor's products or services in the foreign country.

Choosing the right entry mode

Managers must decide on the correct entry mode for a particular product/country. This is done by following one of three rules:

1. The Naive rule - managers use the same entry mode for all their target countries. This is by far the riskiest option, since managers can end up using an inappropriate entry mode for a particular foreign country or forsaking promising foreign markets.
2. The Pragmatic rule - managers start by assessing export entry and adjust their entry strategy accordingly. This saves time and effort, yet may ultimately fail to bring managers to the most appropriate mode.
3. The Strategy rule - managers treat the right entry mode as a key to the success of their foreign entry strategy, making systematic comparisons of all entry modes. It is the most complicated method, yet it results in better entry decisions.

What is a group?

What is a group? How are we to approach groups? In this article we review the development of theory about groups. We look at some different definitions of groups, and some of the key dimensions to bear in mind when thinking about them.

contents: introduction · the development of thinking about groups · defining ‘group’ · types of group · the benefits and dangers of groups · some key dimensions of groups [group interaction, group interdependence, group structure, group goals, group cohesiveness] · group development · conclusion · further reading and references · how to cite this article

Groups are a fundamental part of social life. As we will see they can be very small – just two people – or very large. They can be highly rewarding to their members and to society as a whole, but there are also significant problems and dangers with them. All this makes them an essential focus for research, exploration and action. In this piece I want to examine some of the key definitions of groups that have appeared, review central ways of categorizing groups, explore important dimensions of groups, and look briefly at the group in time.
The development of thinking about groups

Just how we define ‘group’ and the characteristics or ideas we use has been a matter of debate for many years. The significance of collectivities like families, friendship circles, and tribes and clans has been long recognized, but it is really only in the last century or so that groups were studied scientifically and theory developed (Mills 1967: 3). In the last decade of the nineteenth century, Émile Durkheim established just how wrapped up individual identity was with group membership, and Gustave Le Bon argued that people changed as they joined groupings such as crowds. Soon North American sociologists such as Charles Horton Cooley (1909) began to theorize groups more closely – and this was followed by others looking at particular aspects or types of group. Two well known examples are Frederic Thrasher’s (1927) exploration of gang life and Elton Mayo’s (1933) research on the informal relationships between workers in teams. A further, critical, set of interventions came from Kurt Lewin (1948; 1951) who looked to the dynamic qualities of groups and established some important parameters with regard to the way they were to be studied.

As interest in group processes and group dynamics developed and accelerated (most particularly since the 1980s) the research base of the area strengthened. Not unexpectedly, the main arenas for the exploration of groups, and for building theory about them, have continued to be sociology and social psychology. As well as trying to make sense of human behaviour – why people join groups and what they get from them (both good and bad) – the study of groups has had a direct impact on practice in a number of areas of life. Perhaps the most obvious is work – and the contexts and practices of teams. But it has also acted as a spur to development in those fields of education, therapy, social care and social action that use groups to foster change.
Defining ‘group’

As researchers turned to the systematic exploration of group life, different foci for attention emerged. Some social psychologists, for example, looked at the ways in which working in the presence of others tends to raise performance (Allport 1924). Others looked at different aspects of group process. Kurt Lewin (1948), for example, found that nearly all groups were based on interdependence among their members – and this applied whether the group was large or small, formally structured or loose, or focused on this activity or that. In a famous piece Lewin wrote, 'it is not similarity or dissimilarity of individuals that constitutes a group, but interdependence of fate' (op. cit.: 165). In other words, groups come about in a psychological sense because people realize they are 'in the same boat' (Brown 1988: 28). However, even more significant than this for group process, Lewin argued, is some interdependence in the goals of group members. To get something done it is often necessary to cooperate with others.

Interdependence has thus come to play a significant role in the way many writers define 'group' (e.g. Cartwright and Zander 1968). Others have stressed how people categorize themselves as members of something (Turner 1987) or share identity (Brown 1988) (see Exhibit 1). Others again look to communication and face-to-face encounters (Homans 1950), purpose (Mills 1967), structure, and so on. As a starting point, though, I have found Forsyth's (2006) definition the most helpful:

Hundreds of fish swimming together are called a school. A pack of foraging baboons is a troupe. A half dozen crows on a telephone line is a murder. A gam is a group of whales. But what is a collection of human beings called? A group. …. [C]ollections of people may seem unique, but each possesses that one critical element that defines a group: connections linking the individual members…. [M]embers are linked together in a web of interpersonal relationships. Thus, a group is defined as two or more individuals who are connected to one another by social relationships. Donelson R. Forsyth (2006: 2-3) [emphasis in original]

This definition has the merit of bringing together three elements: the number of individuals involved, connection, and relationship. When people talk about groups they are often describing collectivities with two members (a dyad) or three members (a triad); a work team or study group, for example, will often comprise two or three people. However, groups can also be very large collectivities, such as a crowd or a religious congregation. As might be expected, there are differences in some aspects of behaviour between small and large groupings (see below), yet significant commonalities remain.
Exhibit 1: Some definitions of a group

Conceiving of a group as a dynamic whole should include a definition of group that is based on interdependence of the members (or better, the subparts of the group). Kurt Lewin (1951: 146)

We mean by a group a number of persons who communicate with one another often over a span of time, and who are few enough so that each person is able to communicate with all the others, not at second-hand, through other people, but face-to-face. George Homans (1950: 1)

To put it simply they are units composed of two or more persons who come into contact for a purpose and who consider the contact meaningful. Theodore M. Mills (1967: 2)

A group is a collection of individuals who have relations to one another that make them interdependent to some significant degree. As so defined, the term group refers to a class of social entities having in common the property of interdependence among their constituent members. Dorwin Cartwright and Alvin Zander (1968: 46)

Descriptively speaking, a psychological group is defined as one that is psychologically significant for the members, to which they relate themselves subjectively for social comparison and the acquisition of norms and values, … that they privately accept membership in, and which influences their attitudes and behaviour. John C Turner (1987: 1-2)

A group exists when two or more people define themselves as members of it and when its existence is recognized by at least one other. Rupert Brown (1988: 2-3)

In part differences in definition occur because writers often select those things that are of special importance in their work and then posit ‘these as the criteria for group existence’ (Benson 2001: 5). This said, it is possible, as Jarlath F. Benson has done, to identify a list of attributes:

A set of people engage in frequent interactions.
They identify with one another.
They are defined by others as a group.
They share beliefs, values, and norms about areas of common interest.
They define themselves as a group.
They come together to work on common tasks and for agreed purposes. (Benson 2000: 5)

From this, Benson suggests that groups are intended and organic. They are not some random experience, and as a result they have three crucial characteristics:

There are parts
There is relationship between the parts
There is an organizing principle (op. cit.).

To this we might also add, as both John C. Turner (1987) and Rupert Brown (1989) have pointed out, groups are not just systems or entities in their own right but exist in relation to other groups.
Types of groups

There are various ways of classifying groups, for example in terms of their purpose or structure, but two sets of categories have retained their usefulness for both practitioners and researchers. They involve the distinctions between:

primary and secondary groups; and
planned and emergent groups.

Primary and secondary groups

Charles Horton Cooley (1909) established the distinction between ‘primary groups’ and ‘nucleated groups’ (now better known as secondary groups):

Primary groups are clusters of people like families or close friendship circles where there is close, face-to-face and intimate interaction. There is also often a high level of interdependence between members. Primary groups are also the key means of socialization in society, the main place where attitudes, values and orientations are developed and sustained.

Secondary groups are those in which members are rarely, if ever, all in direct contact. They are often large and usually formally organized. Trades unions and membership organizations such as the National Trust are examples of these. They are an important place for socialization, but secondary to primary groups.

This distinction remains helpful – especially when thinking about which environments are significant for socialization (the process of learning how to become a member of society through internalizing social norms and values, and by learning to perform our different social roles). The distinction helps to explain the limited impact of schooling in important areas of social life (teachers rarely work in a direct way with primary groups) and some of the potential of informal educators and social pedagogues (who tend to work with both secondary and primary groups – sometimes with families, often with close friendship circles).
Planned and emergent groups

Alongside discussion of primary and secondary groups, came the recognition that groups tend to fall into one of two broad categories:

Planned groups. Planned groups are specifically formed for some purpose – either by their members, or by some external individual, group or organization.

Emergent groups. Emergent groups come into being relatively spontaneously where people find themselves together in the same place, or where the same collection of people gradually come to know each other through conversation and interaction over a period of time. (Cartwright and Zander 1968).

As Forsyth (2006: 6) has put it ‘People found planned groups, but they often find emergent groups’. Sometimes writers use the terms ‘formed’ groups and ‘natural groups’ to describe the same broad distinction – but the term ‘natural’ is rather misleading. The development of natural groups might well involve some intention on the part of the actors.

More recently the distinction between formed and emergent groups has been further developed by asking whether the group is formed by internal or external forces. Thus, Arrow et al. (2000) have split planned groups into 'concocted' (planned by people and organizations outside the group) and 'founded' (planned by a person or people who are in the group). They also divided emergent groups into 'circumstantial' (unplanned and often temporary groups that develop when external forces bring people together, e.g. people in a bus queue) and 'self-organizing' (where people gradually cooperate and engage with each other around some task or interest).
Some benefits and dangers of groups

As can be seen from what we have already reviewed, groups offer people the opportunity to work together on joint projects and tasks – they allow people to develop more complex and larger-scale activities. We have also seen that groups can be:

significant sites of socialization and education – enabling people to develop a sense of identity and belonging, and to deepen knowledge, skills, and values and attitudes.
places where relationships can form and grow, and where people can find help and support.
settings where wisdom flourishes. As James Surowiecki (2004) has argued, it is often the case that 'the many are smarter than the few'.

However, there is a downside to all this. The socialization they offer might be highly constraining and oppressive for some of their members. They can also become environments that foster interpersonal conflict. Furthermore, the boundaries drawn around groups are part of a process of excluding certain people (sometimes to their detriment) and creating inter-group conflict. There is also evidence to show that groups can impact upon individuals in ways that warp their judgements and that lead to damaging decision making (what some commentators have talked about as ‘groupthink’).

For these reasons we need to be able to appreciate what is going on in groups – and to act where we can to make them more fulfilling and beneficial to their members and to society as a whole.
Some key dimensions of groups

Those engaged in the systematic exploration of group processes and dynamics have used different ways of observing group behaviour and gaining insight into the experience of being part of groups. Some have tried for more of an ‘insider’ view using participant observation and conversation. Perhaps the best known example of this was William F. Whyte’s (1943) study of street corner society. Others have used more covert forms of observation, or looked to structured and overt observation and interviews. A classic example of the sort of scheme that has been used when looking at groups in more structured ways is Robert Freed Bales’ (1950) IPA system (Interaction Process Analysis) with its 12 different ways of coding group behaviour e.g. ‘shows solidarity’, ‘agrees’, ‘asks for opinion’ and so on.

All this research, and the contrasting orientations informing it, has generated different ideas about what to look out for in groups and, in particular, the forces impacting upon group processes and dynamics. I want to highlight five:

Group interaction
Group interdependence
Group structure
Group goals
Group cohesion (and entitativity)

There are various ways of organizing and naming the significant qualities – but I have found this approach (taken from Donelson R. Forsyth 1990: 8-12; 2006: 10-16) to be the most helpful way to start exploration.
Group interaction

Those involved with researching and working with groups have often come at interaction – the way in which people engage with and influence each other – from contrasting perspectives. As we have already seen, Bales (1950, 1999) looked at categorizing social interventions in terms of the ways in which they appear to impact on group process – and in particular the extent to which they looked to ‘getting on with the job’ or ‘having regard for others’ (Brown 1988: 19). This distinction has turned out to be one of the most enduring features of much that has been written about groupwork.

Task interaction can be seen as including ‘all group behaviour that is focussed principally on the group’s work, projects, plans and goals’ (Forsyth 2006: 10).

Relationship interaction (or socio-emotional interaction) is centred around the social and interpersonal aspects of group life.

This distinction has found its way into different aspects of practice – for example when thinking about leadership in groups (whether leaders focus on structure and task actions, or on the feelings and needs of the group members) (see, in particular, Hersey and Blanchard 1977). Thus actions can be categorized into whether they are concerned with task or maintenance (sometimes also described respectively as instrumental or expressive interventions) (Brown 1994: 71).
Group interdependence

As Robert S. Baron et al. (2003: 139) have argued, it is a basic feature of groups that group members' outcomes often depend not only on their own actions, but also on the actions of others in the group. One member's feelings, experiences and actions can come to be influenced in whole or in part by others. Here it is also helpful to take up a distinction formulated by Morton Deutsch (1949), one of Lewin's graduate students, when looking at cooperation and competition in groups. He contrasted social interdependence – which exists when people share common goals and each person's outcomes are affected by the actions of others – with social dependence, where the 'outcomes of one person are affected by the actions of a second person but not vice versa' (Johnson and Johnson 2003: 94).
Group structure

Most commentators on group process and group dynamics discuss group structure – but just what they include under this heading differs. Here we are going to follow Forsyth (2006: 11) and define group structure as the ‘[n]orms, roles and stable patterns of relationship among the members of the group’.

Group size. An obvious but crucial consideration is the size of the group. Large groups function differently in a number of important respects from smaller groups. Size impacts on group communication, for example: in smaller groups a higher proportion of people are likely to participate – there is potentially more time for each, and the smaller number of people involved means that speaking may not be as anxiety-making as in a large group. Large groups, by contrast, are more likely to include people with a range of skills, which allows for more specialization of labour. Larger groups can also allow us to feel more anonymous. 'As a result, we may exhibit less social responsibility…, which in turn will often lead to less task involvement and lower morale on the part of many group members as size increases' (Baron et al. 2003: 7).

Group norms. Norms are basically rules of conduct that indicate what attitudes and behaviour might be expected or demanded in particular social situations and contexts. They are shared expectations of behaviour that set up what is desirable and appropriate in a particular setting or group. However, as soon as we talk about expected behaviour there is room for confusion. Here the norm is not referring to what is likely to occur, but what we think should occur. For example, we can expect a certain level of violence in town centres as the bars and clubs close, but most people would probably say that it shouldn’t be happening.

Socially established and shared beliefs regarding what is normal, correct, true, moral and good 'generally have powerful effects on the thoughts and actions of group members' (Baron et al. 2003: 6). Group norms often develop because they are necessary for the group to survive and/or to achieve its ends; group life is dependent upon trust and a certain amount of loyalty, for example. Furthermore, as Baron et al. have commented, 'norms provide codes of behaviour that render social life more predictable and efficient' (op. cit.). They also act to reduce uncertainty in difficult situations and provide a way forward for interaction.

Roles. The bundle of expectations and attributes linked to a social position can be seen as a role. In groups, people expect certain sorts of behaviour from those they see as the leader, for example. Various ways of conceptualizing role have emerged in the study of groups, e.g. 'information giver', 'harmonizer', 'recorder' and so on. Some of these schemes are helpful, some are not – but what cannot be disputed is the significance of role in groups. Different people play different roles – sometimes these are assigned (such as in the membership of committees), sometimes they emerge through interaction. As Johnson and Johnson (2003: 24) have put it, 'Roles define the formal structure of the group and differentiate one position from another'. Crucially, different social roles are often linked to different degrees of status and power within the group.
Group goals

An obvious, but sometimes overlooked, factor in group processes and dynamics is the reason why the group exists. What does it do for its members? What is its object? How did it come to be created? As Alvin Zander (1985: 1-13) has shown, the form that a group takes is often heavily dependent on its purpose. Moreover, a group will often have several and possibly conflicting purposes which can then become expressed as tensions between members.

Group goals are ideals – they are the ends (the aims or the outcomes) sought by the group and its members. They entail some sort of joint vision (Johnson and Johnson 2003: 73). Without some commitment to the pursuit of common goals the group will not survive or be effective (Benson 2001: 66). Of great significance then is what might be called goal structure. Here a key distinction is between cooperative and competitive goal structures:

A co-operative goal structure develops when the individual goals of members are visible and similar… A competitive goal structure emerges where the individual goals of members are hidden or seen as different or opposed. (Benson 2001: 67)

Hidden agendas can be very destructive and lead to conflict in the group.
Group cohesion

Forsyth (2006: 13) makes the point that ‘Groups are not merely sets of aggregated, independent individuals; instead they are unified social entities. Groups cannot be reduced down to the level of the individual without losing information about the group unit, as a whole’. The notion of group cohesion – the forces or bonds that bind individuals to the collectivity – is fundamental to an appreciation of groups. In some groups the power of the bonds, the feelings that group members have for each other and the extent to which they are prepared to cooperate to achieve their goals will be slight. In others these may be seen as strong. Here the word ‘seen’ is significant – for it may well be that a group is not experienced by its members as particularly co-operative, for example, but they, and those looking on, may believe it to be a social entity, a whole.

In recent years there has been a growing literature around ‘group entitativity’ – the degree to which something appears to be a unified entity. Another way of thinking about this is as the ‘groupness’ of the people you might be observing in a particular situation (Brown 1999). It was Donald T. Campbell (1958) who first used the term entitativity. He argued that when groups become real they possess the characteristics of entities (Forsyth 2006: 15). Campbell based his analysis on explorations into how the mind works when deciding when something is to be approached as a whole (a gestalt or something that cannot be described as the sum of its parts) or ‘a random collection of unrelated elements’ (Forsyth 2006: 15). When looking at people together in particular places (what he calls the ‘aggregate’) Campbell concluded that we depend on three main cues to make judgements about entitativity:

Common fate – the extent to which individuals in the ‘aggregate’ seem to experience the same, or interrelated outcomes.
Similarity – the extent to which the individuals display the same behaviours or resemble one another.
Proximity – the distance among individuals in the ‘aggregate’ (or group). (described in Forsyth 2006: 15)

We might look, thus, at people seated around a table in a café or bar: the extent to which they join in things together (e.g. laughing, discussing); whether they are acting in a similar way or have something in common (e.g. in the way they dress, the things they have with them); and how closely they are sitting together.
Group development

Groups change over time. There is a real sense in which they are living things: they emerge, they exist, and they die. This phenomenon has led to the formulation of a wide range of theoretical models concerning developmental processes. Most commentators assume that groups go through a number of phases or stages if they exist for an extended period. It is clear, for example, that people tend to want to know something about the other members; that a group has to develop a degree of interdependence in order to achieve its tasks and be satisfying to its members; and that it has to learn at some level to deal with conflict if it is to survive. The most influential model of the developmental process – certainly in terms of its impact upon texts aimed at practitioners – has been that of Bruce W. Tuckman (1965). While there are various differences concerning the number of stages and their names, many have adopted a version of Tuckman's model: forming, storming, norming and performing.

[Illustration: a cyclical version of Bruce W. Tuckman's group development model]

He was later to add a fifth stage – adjourning (Tuckman and Jensen 1977) [all discussed at length in Bruce W. Tuckman – forming, storming, norming and performing in groups].
Conclusion

From this brief overview we can see the significance of groups and why it may be important to intervene in them – both to strengthen their potential as sites of mutual aid and communal well-being, and to help them become more fulfilling to their individual members. They are a fundamental part of human experience and play a crucial role both in terms of shaping and influencing individual lives and society itself.

Humans are small group beings. We always have been and we always will be. The ubiquitousness of groups and the inevitability of being in them makes groups one of the most important factors in our lives. As the effectiveness of our groups goes, so goes the quality of our lives.

To ensure that groups are effective, members must be extremely competent in using small group skills. Humans are not born with these skills; they must be developed. (Johnson and Johnson 2003: 579; 581)

Those skills – and the attitudes, orientations and ideas associated with them – are learnt, predominantly, through experiencing group life. They can also be enhanced by the intervention of skilled leaders and facilitators – but that is another story [see working with groups].
Further reading and references

Forsyth, Donelson R. (2006) Group Dynamics 4e [International Student Edition]. Belmont CA.: Thomson Wadsworth Publishing. 682 + xxii pages. Pretty much the standard textbook on groups, it has gone from strength to strength through its four editions.

Johnson, David W. and Frank P. Johnson (2009) Joining Together. Group theory and group skills 10e. Boston: Merrill. 660 + xii pages. Still the best starting point for an exploration of groupwork practice. It begins by providing an overview of group dynamics and experiential learning and then looks at key dimensions of group experience and the role of the leader/facilitator.
References

Allport, F. H. (1924) Social Psychology. Boston: Houghton Mifflin.

Bales, Robert Freed (1950) Interaction Process Analysis: A method for the study of small groups. Chicago: University of Chicago Press.

Bales, Robert Freed (1999) Social Interaction Systems: Theory and measurement. New Brunswick, NJ.: Transaction.

Baron, Robert S. and Norbert L. Kerr (2003) Group Process, Group Decision, Group Action 2e. Buckingham: Open University Press.

Benson, Jarlath (2000) Working More Creatively with Groups. London: Routledge.

Brown, Rupert (1999) Group Processes: Dynamics within and between groups 2e. Oxford: Wiley-Blackwell.

Campbell, Donald T. (1958) 'Common fate, similarity, and other indices of aggregates of persons as social entities', Behavioral Science 3: 14-25.

Cartwright, Dorwin and Alvin Zander (eds.) (1968) Group dynamics: research and theory 3e. London: Tavistock Publications.

Cooley, C. H. (1909) Social Organization. A study of the larger mind. New York: Scribners.

Deutsch, Morton (1949) ‘A theory of cooperation and competition’, Human Relations 2: 129-152.

Doel, Mark (2005) Using Groupwork. London: Routledge.

Durkheim, Émile (2002) Suicide. London: Routledge. [First published in 1897]

Forsyth, Donelson R. (1990) Group Dynamics 2e. Pacific Grove CA.: Brooks Cole.

Forsyth, Donelson R. (2006) Group Dynamics 4e [International Student Edition]. Belmont CA.: Thomson Wadsworth Publishing.

Hersey Paul and Blanchard, Kenneth (1977) Management of Organizational Behaviour: Utilizing human resources. 3e. Englewood Cliffs, NJ.: Prentice Hall.

Homans, George (1951) The Human Group, London: Routledge and Kegan Paul.

Johnson, David W. and Frank P. Johnson (2003) Joining Together. Group theory and group skills 8e. Boston: Allyn and Bacon.

Le Bon, Gustave (2006) The Crowd. A study of the popular mind. New York: Cosimo Books. [First published in English in 1896].

Lewin, Kurt (1948) Resolving social conflicts; selected papers on group dynamics. Gertrude W. Lewin (ed.). New York: Harper & Row, 1948.

Lewin, Kurt (1951) Field theory in social science; selected theoretical papers. D. Cartwright (ed.). New York: Harper & Row.

Mayo, Elton (1933) The Human Problems of an Industrial Civilization. New York: Macmillan.

McDermott, Fiona (2002) Inside Group Work. A guide to reflective practice. Crows Nest, NSW: Allen and Unwin.

Mills, Theodore M. (1967) The Sociology of Small Groups. Englewood Cliffs, N.J.: Prentice-Hall.

Surowiecki, James (2004) The Wisdom of Crowds. Why the many are smarter than the few. London: Abacus.

Thrasher, F. (1927) The Gang. Chicago: University of Chicago Press.

Tuckman, Bruce W. (1965) ‘Developmental sequence in small groups’, Psychological Bulletin, 63, 384-399. The article was reprinted in Group Facilitation: A Research and Applications Journal · Number 3, Spring 2001 and is available as a Word document: http://dennislearningcenter.osu.edu/references/GROUP%20DEV%20ARTICLE.doc. Accessed January 14, 2005.

Tuckman, Bruce W. and Jensen, Mary Ann C. (1977) 'Stages of small group development revisited', Group and Organizational Studies, 2: 419-427.

Turner, J. C. with M. A. Hogg (1987) Rediscovering the Social Group: A self-categorization theory. Oxford: Basil Blackwell.

Whyte, William Foote (1943, 1955, 1966, 1981, 1993) Street Corner Society: social structure of an Italian slum. Chicago: University of Chicago Press.

Zander, Alvin (1985) The Purposes of Groups and Organizations. San Francisco: Jossey-Bass.

Acknowledgements: The picture – Circle of friends – is by Fred Armitage, sourced from Flickr and reproduced here under a Creative Commons Attribution-Non-Commercial-No Derivative Works 2.0 Generic licence. https://www.flickr.com/photos/fredarmitage/48405833/

How to cite this article: Smith, Mark K. (2008). ‘What is a group?’, the encyclopaedia of informal education. [www.infed.org/mobi/what-is-a-group/. Retrieved: insert date].

© Mark K Smith 2005, 2008