CONVERGENCE IN INFORMATION AND COMMUNICATION TECHNOLOGY (ICT) USING PATENT ANALYSIS

Since the 1990s, information and communication technology (ICT) has been perceived as a critical technology for economic development, and the ICT industry itself has been growing exceptionally fast. Moreover, technology convergence in ICT has received particular attention: ICT innovations diffuse into existing products and come to form a new, integral part of those goods. This exploratory study examines supply-side technology convergence at the firm level in the ICT sector, using the International Patent Classification (IPC) codes of 43,636 sample patents from 1995 to 2008. Through association rule mining of patent co-classification, the study identifies the degree of merging and the relationships between different technology domains. This type of analysis helps companies formulate strategies in an environment of changing technological trajectories.
Keywords: technological convergence; information and communication technology (ICT); patent; international patent classification (IPC); co-classification; association rule mining
Kim, E., Kim, J., Koh, J. JISTEM, Brazil, Vol. 11, No. 1, Jan/Apr 2014, pp. 53-64. www.jistem.fea.usp.br


INTRODUCTION
Since the 1990s, information and communications technology (ICT) has been perceived as an essential technology for economic development, and ICT industries themselves have been growing exceptionally fast. Moreover, technology convergence has received particular attention. ICT innovations diffuse into existing products and thus come to form a new integral part of the goods.
The creation of synergies, blurring of industry boundaries, integration, and overlapping of markets are all used to describe convergence. The convergence phenomenon has been mainly observed and discussed in ICT sectors. Patents play an increasingly important role in innovation, and patent data are used to indicate the innovative activity of companies, industries and countries. Patent analysis can be regarded as one of the most effective methods for keeping in touch with technology trends (Karvonen and Kassi, 2010). Scholars have used patent analysis to discuss the convergence phenomenon (Duysters and Hagedoorn, 1998; Gambardella and Torrisi, 1998; Bröring, 2005; Curran and Leker, 2011; Karvonen and Kassi, 2010). This exploratory study examines supply-side technology convergence at the firm level in the ICT sector using patent analysis. The International Patent Classification (IPC) codes of 46,363 sample patents from 1995 to 2008 are employed to find the degree of merging and the relationships between different technology fields through association rule mining of patent co-classification.
The paper is structured as follows. Section 2 briefly outlines the background of convergence and presents the use of patent analysis. Section 3 presents the research setting, including research data and methodology. Section 4 reports the empirical results. Finally, Section 5 concludes with a discussion of this study and directions for future research.

BACKGROUND
Convergence is an often used but rarely defined concept. Ideas such as the creation of synergies, disappearance of industry boundaries, integration, or overlapping of markets are all used to describe this phenomenon. Technological convergence is the tendency for different technological systems to evolve towards performing similar tasks (Wikipedia). Convergence can refer to previously separate technologies, such as voice, data and video, that now share resources and interact with each other, synergistically creating new efficiencies. The phenomenon of convergence occurs when innovations emerge at the intersection of established and clearly defined industry boundaries, thereby sparking off an evolutionary development with a much broader impact. In recent industry developments within information technology (IT), bio-technology (BT) and nano-technology (NT), the convergence of technologies and knowledge bases has induced a variety of industrial points of inflection. Hence, industry boundaries have become blurred, and innovation no longer takes place within previously existing industrial silos, but rather between them (Hacklin et al., 2009).
The dictionary definition of convergence is 'tendency to meet at a point' or 'gradual change so as to become similar or develop something in common'. The first use of the term convergence can be traced back to Rosenberg (1963), who introduced the label 'technological convergence' to describe the evolution towards a specialized machine tool industry in the US in the late 1800s. Rosenberg's notion of technological convergence appears to have re-emerged in recent decades as a way of describing the apparent merger of telecom, data communication, IT, media and entertainment into a giant ICT and multimedia industry (Gambardella and Torrisi, 1998). During the 1990s, convergence was mainly discussed in the context of the merger of the IT, telecommunications, media and entertainment industries into a giant 'infocom' sector (Lind, 2004).
According to Greenstein and Khanna (1997), convergence takes two basic forms: substitutes and complements. Competitive convergence, the 'Substitutes Paradigm', occurs when products or services become interchangeable with one another in fulfilling a certain set of user needs through the bundling of functions. Complementary convergence, the 'Cooperative Paradigm', occurs when products or services from different industries are merged to meet a larger or new set of consumer needs simultaneously. Pennings and Puranam (2001) classified industry convergence along two dimensions: substitution/complementation and supply/demand. Stieglitz (2003) suggested a similar classification but replaced the supply/demand dimension with a technology-based/product-based dimension. Hacklin (2008) and Hacklin et al. (2009) developed and discussed, from an evolutionary perspective, a process of four sequential convergence stages: knowledge, technology, application and industry convergence. Curran and Leker (2011) discussed how to measure convergence within this sequential process, which evolves as scientific disciplines, technologies and/or markets converge. Starting with scientific disciplines that begin to use more and more of one another's research results, scientific convergence begins with cross-disciplinary citations and eventually develops into closer research collaborations. After the distance between basic science areas has been decreasing for some time, applied science and technology development should follow, leading to technology convergence (Meyer, 2000; Murray, 2002; Bainbridge, 2006). Pennings and Puranam (2001) argued that, assuming classification schemes such as the International Patent Classification (IPC) are valid, convergence can be found in patent data through growing overlap among IPC classes and through an increase in patent citations between different classes. Patent analysis has been employed in the context of technology-driven convergence of electronics, computers, and telecommunication (Duysters and Hagedoorn, 1998; Gambardella and Torrisi, 1998; Bröring, 2005), as patents are often regarded as outcome indicators for organizations' R&D activities (Ernst, 1995; Fai and Tunzelmann, 2001). Curran and Leker (2009; 2011) discussed convergence indicators using patent data. Xing et al. (2011) also tried to measure industry convergence with input-output analysis.

RESEARCH METHODS AND SAMPLES
In this research, association rule mining is used to analyze technology integration and diversification at the firm level. Data mining, also referred to as knowledge discovery in databases, is a process of nontrivial extraction of implicit, previously unknown and potentially useful information, such as knowledge rules, constraints and regularities, from data in a database (Chen et al., 1996). Data mining differs from traditional statistics in that formal statistical inference is assumption-driven, in the sense that a hypothesis is formed and validated against the data. Data mining, in contrast, is discovery-driven, in the sense that patterns and hypotheses are automatically extracted from data (Zhang and Zhang, 2002). Data mining has made broad and significant progress since its early beginnings in the 1980s. Today, data mining is used in a vast array of areas, and numerous commercial mining systems are available (Han et al., 2006). Association mining is one of the best-studied methods in data mining (Agrawal et al., 1993; Agrawal and Srikant, 1994; Chen et al., 1996; Han and Kamber, 2001). Since its introduction by Agrawal et al. (1993), the area of association rule mining has received a great deal of attention.
Association rule mining has been developed mainly to identify strong relationships among itemsets that occur with high frequency and strong correlation. Association rules are produced by finding interesting associations or correlation relationships among a large set of data items (Jiao and Zhang, 2005), and enable us to detect the items that frequently occur together in an application (Zhang and Zhang, 2002).
An association rule (Agrawal et al., 1993) is a probabilistic relationship of the form A -> B, where A and B are disjoint itemsets. The intuitive meaning of such a rule is that the transactions (or tuples) that contain itemset A also tend to contain itemset B. An association rule indicates that the occurrence of a certain itemset in a transaction implies the occurrence of another itemset in the same transaction. The rule suggests that a strong relationship exists between the itemsets. Association analysis is applicable to market basket data, bioinformatics, medical diagnosis, Web mining, and scientific data analysis (Tan et al., 2005).
The importance of a rule is usually measured by two numbers, support and confidence. These two properties provide the empirical basis for deriving the inference expressed in the rule and a measure of the interest in the rule. The support for the association rule A -> B is the percentage of all transactions that contain both itemset A and itemset B. The confidence for the rule A -> B is the percentage of transactions containing itemset B among the transactions that contain itemset A (Tan et al., 2005). In other words, the rule A -> B holds with support s if s% of transactions in the database contain both itemset A and itemset B, and it holds with confidence c if c% of transactions that contain itemset A also contain itemset B. Association rule mining finds all rules in the database that satisfy given minimum support and minimum confidence constraints (Agrawal and Srikant, 1994). Additionally, the lift value (Brin et al., 1997) is used to judge the strength of an association rule. The lift of a rule A -> B is defined as lift(A -> B) = support(A ∪ B) / (support(A) × support(B)), where support(A ∪ B) is the share of transactions containing every item of both A and B. A lift ratio greater than 1.0 suggests that there is some usefulness to the rule. The larger the lift ratio, the greater the strength of the association (Kim and Park, 2006).
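The three measures can be illustrated with a minimal Python sketch. The toy transactions and the function names below are assumptions made for illustration only; they are not data or code from this study.

```python
# Toy transactions (hypothetical; not taken from the study's patent data).
transactions = [
    frozenset({"A", "B"}),
    frozenset({"A", "B", "C"}),
    frozenset({"A"}),
    frozenset({"C"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(a, b, transactions):
    """Share of the transactions containing A that also contain B."""
    both = frozenset(a) | frozenset(b)
    return support(both, transactions) / support(a, transactions)


def lift(a, b, transactions):
    """support(A u B) / (support(A) * support(B)); above 1.0 suggests a useful rule."""
    both = frozenset(a) | frozenset(b)
    return support(both, transactions) / (
        support(a, transactions) * support(b, transactions)
    )


if __name__ == "__main__":
    print(support({"A", "B"}, transactions))
    print(confidence({"A"}, {"B"}, transactions))
    print(lift({"A"}, {"B"}, transactions))
```

In this toy set, two of the four transactions contain both A and B, three contain A, and two contain B, so the rule A -> B has support 2/4, confidence 2/3 and lift 4/3.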
In this study, an itemset is the set consisting of the primary patent classification code and the secondary patent classification codes of a patent, and a transaction corresponds to the granting of a patent. A patent's classification codes indicate its technology area. Patent information includes one primary classification code and, optionally, secondary classification codes. Patent co-classification clearly shows convergence between different technological domains (Curran and Leker, 2011). In this research, association rule mining is applied to the patent co-classification information to discover linkage patterns that show strong relations among various technology areas.
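A minimal Python sketch of how a patent record might be turned into such a transaction follows. The record layout (a `primary` code plus a list of `secondary` codes), the sample codes and the function names are hypothetical; the sketch also assumes that the subclass is the first four characters of an IPC code, and that co-classification is assessed at the subclass level.

```python
def to_subclass(ipc_code):
    """Return the four-character IPC subclass, e.g. 'G06F 17/30' -> 'G06F'."""
    return ipc_code.strip()[:4].upper()


def patent_to_transaction(primary, secondary):
    """Build one transaction: the set of distinct subclasses of a patent."""
    return frozenset(to_subclass(code) for code in [primary, *secondary])


# Hypothetical patent records, for illustration only.
patents = [
    {"primary": "G06F 17/30", "secondary": ["H04L 29/06", "G06F 9/46"]},
    {"primary": "H01L 21/00", "secondary": []},
]

transactions = [
    patent_to_transaction(p["primary"], p["secondary"]) for p in patents
]

# Patents whose codes span more than one subclass are the co-classified ones.
co_classified = [t for t in transactions if len(t) > 1]

if __name__ == "__main__":
    print(transactions)
    print(len(co_classified))
```

Using a set collapses repeated subclasses, so a patent with two codes in the same subclass contributes that subclass only once to its transaction.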
In order to examine technological convergence, 46,363 patents granted to IBM by the USPTO from 1995 to 2008 are used in this study. IBM holds more patents than any other technology company, and has topped the list of the world's most inventive companies since 1993.
The International Patent Classification (IPC) provides a hierarchical system of language-independent symbols for the classification of patents and utility models according to the different areas of technology to which they pertain (WIPO). Each granted patent is assigned IPC codes that indicate the nature of the patent. One patent can be assigned more than one IPC code if it finds application in various domains. Each company has a few subclasses to which most of its patents are assigned. These subclasses describe its core technological competencies. If a company has been granted patents in only a few subclasses, the technologies employed by the company can be said to be highly focused on a narrow field of expertise. On the other hand, if the patents are not concentrated in a few subclasses, its research can be said to be diversified.
The IPC codes in a firm's patent records are identified and classified into technology fields representing the firm's technology domains. Patent applications in each field indicate an accumulation of knowledge and advancement along the technological trajectory (Fai and Tunzelmann, 2001). IPC codes are a hierarchical way of assigning the category to which every patent belongs. There are 8 sections, 120 classes, 628 subclasses and about 70,000 groups. Following the IPC and Technology Concordance Table (WIPO, 2008), the 628 subclasses are aggregated into 35 technological fields, and for descriptive purposes these are further aggregated into five main categories: electrical engineering, instruments, chemistry, mechanical engineering and others. The 35 fields are listed in the appendix. The subclasses of the sample patents are analyzed using association rule mining in this research. Among the 43,636 sample patents used for this analysis, 13,338 patents were assigned to more than one IPC.
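The hierarchy can be made concrete with a short Python sketch that splits a full IPC symbol into its levels. The regular expression, the `IPC` tuple and the function name are illustrative assumptions; the pattern covers only the common "G06F 17/30" style of notation and is not an official parser.

```python
import re
from typing import NamedTuple


class IPC(NamedTuple):
    section: str     # one letter A-H, e.g. 'G'
    ipc_class: str   # section + two digits, e.g. 'G06'
    subclass: str    # class + one letter, e.g. 'G06F'
    main_group: str  # subclass + main group number, e.g. 'G06F 17'
    group: str       # full symbol, e.g. 'G06F 17/30'


# Assumed layout: section letter, two-digit class, subclass letter,
# main group number, '/', subgroup number.
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(code):
    """Split a full IPC symbol into its hierarchical levels."""
    match = _IPC_RE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {code!r}")
    section, cls, sub, main, subgroup = match.groups()
    subclass = f"{section}{cls}{sub}"
    return IPC(
        section=section,
        ipc_class=f"{section}{cls}",
        subclass=subclass,
        main_group=f"{subclass} {main}",
        group=f"{subclass} {main}/{subgroup}",
    )


if __name__ == "__main__":
    print(parse_ipc("G06F 17/30"))
```

Only the `subclass` level is needed for the analysis in this paper; the other levels are shown to make the nesting of section, class, subclass and group explicit.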

ANALYSIS
In this study, the R software environment is used to analyze the patent dataset. For the association rule mining, the well-known 'apriori( )' algorithm, which is included in the R package 'arules', is executed. The thresholds for mining this dataset are 0.05% for minimum support and 90% for minimum confidence. The lift values of all association rules resulting from the mining are greater than 1.0, indicating the usefulness of the rules and the strength of the associations.
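The study itself ran 'apriori( )' from the R package 'arules'. Purely as an illustration of what such a run computes, the following Python sketch mines rules with a single-item consequent by brute-force enumeration of small itemsets; it does not implement Apriori's candidate pruning and is not the code used in the study. The function name, the `max_len` limit and the toy transactions are assumptions; only the two default thresholds are taken from the text.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.0005, min_confidence=0.9, max_len=3):
    """Return (lhs, rhs, support, confidence, lift) tuples, sorted.

    Brute-force sketch: counts every itemset of up to `max_len` items that
    occurs in a transaction, then derives rules with one item on the right.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, min(max_len, len(items)) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1

    rules = []
    for itemset, count in counts.items():
        if len(itemset) < 2 or count / n < min_support:
            continue
        for rhs in itemset:
            lhs = itemset - {rhs}
            conf = count / counts[lhs]
            if conf >= min_confidence:
                rule_lift = conf / (counts[frozenset({rhs})] / n)
                rules.append((tuple(sorted(lhs)), rhs, count / n, conf, rule_lift))
    return sorted(rules)


# Hypothetical subclass transactions, for illustration only.
toy = [
    frozenset({"G06F", "G06Q"}),
    frozenset({"G06F", "G06Q"}),
    frozenset({"G06F"}),
    frozenset({"G06F", "H04L"}),
    frozenset({"H01L"}),
]

if __name__ == "__main__":
    # A much higher support threshold than 0.05% is used for this tiny toy set.
    for rule in mine_rules(toy, min_support=0.3, min_confidence=0.9):
        print(rule)
```

Brute-force counting is adequate here because a patent carries only a handful of subclasses; for wide transactions, a proper Apriori or FP-growth implementation would be the more sensible choice.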
Table 1 shows the IPC statistics of the sample patent dataset. During the fourteen years from 1995 to 2008, IBM's patents were spread across 355 different IPC subclasses. The major IPC subclass codes of the dataset are 'G06F' and 'H01L', as indicated in Table 1. More than 60% of the company's granted patents during this period fall into these two subclasses.
In this study, the minimum support for association rule mining of IPC co-classification is set to 0.05%, which is lower than a typical threshold value. The support value used to detect association rules is related to the frequency of occurrence. The main purpose of this study is to examine technological convergence within a patent using its IPC co-classification. As displayed in Table 1, more than half of the company's patents are concentrated in two technological domains, and the others are dispersed across more than 300 different subclasses. In order to detect association rules between different technology subclasses, the minimum support value therefore has to be lowered in this study.
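To make the threshold tangible, the following Python sketch converts a minimum support share into the smallest number of patents a co-classification pattern must appear in. It assumes the 43,636-patent sample reported for the analysis; the function name is hypothetical, and exact fractions are used so that rounding does not distort the ceiling.

```python
import math
from fractions import Fraction


def min_patent_count(n_patents, min_support):
    """Smallest number of patents whose share reaches `min_support`."""
    return math.ceil(Fraction(min_support) * n_patents)


if __name__ == "__main__":
    # 0.05% expressed as an exact fraction: 5 / 10,000.
    print(min_patent_count(43636, Fraction(5, 10000)))
    # A 1% threshold, shown only for comparison.
    print(min_patent_count(43636, Fraction(1, 100)))
```

Under this assumption, a 0.05% threshold corresponds to a pattern occurring in at least 22 patents, whereas a 1% threshold would require at least 437 patents, which patterns among the sparsely populated subclasses are unlikely to reach.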
Table 2 shows the results of association rule mining for co-classification of the IPC dataset. The association rules detected from the patent dataset clearly show the relationships between different fields of technology. As mentioned above, IPC subclasses can be grouped into 35 technology fields according to the IPC and Technology Concordance Table (WIPO, 2008). The 32 detected association rules can be interpreted on the basis of the concordance table and divided into five types. First, association rules #1 to #17 present the relationships between 'computer technology' and other fields in electrical engineering, such as 'telecommunications', 'digital communication' and 'semiconductors'. Second, association rules #18 to #26 show the technological combinations between electrical engineering and instruments. Third, association rule #27 indicates technological integration among three domains: electrical engineering, instruments, and mechanical engineering. Fourth, association rules #28, #29, and #30 show the technological combinations between electrical engineering and mechanical engineering. Fifth, the last two association rules, #31 and #32, present the relationships between 'organic fine chemistry' and 'biotechnology' in chemistry. These two rules do not indicate convergence between 'computer technology', on which the company is focused, and technology fields in chemistry. They indicate the combinations between the 'organic fine chemistry' and 'macromolecular chemistry, polymers' fields.
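The interpretation step can be sketched in Python as a lookup from subclass to technology field and main category, followed by a check of whether a rule spans more than one main category. The four lookup entries are an illustrative excerpt that should be verified against the concordance table, and the example rules are hypothetical rather than rows of Table 2.

```python
# Illustrative excerpt only: subclass -> (technology field, main category).
# These entries are assumptions to be checked against the WIPO concordance table.
FIELD_OF = {
    "G06F": ("Computer technology", "Electrical engineering"),
    "H04L": ("Digital communication", "Electrical engineering"),
    "H01L": ("Semiconductors", "Electrical engineering"),
    "G02B": ("Optics", "Instruments"),
}


def describe_rule(lhs, rhs):
    """Summarize which fields and main categories a rule lhs -> rhs touches."""
    subclasses = [*lhs, rhs]
    fields = sorted({FIELD_OF[s][0] for s in subclasses})
    categories = sorted({FIELD_OF[s][1] for s in subclasses})
    return {
        "fields": fields,
        "categories": categories,
        "cross_category": len(categories) > 1,
    }


if __name__ == "__main__":
    # Hypothetical rules, not taken from Table 2.
    print(describe_rule(("H04L",), "G06F"))
    print(describe_rule(("G02B",), "G06F"))
```

A rule whose subclasses all map to one main category corresponds to the first type described above, while a rule flagged as `cross_category` corresponds to the types that combine, for example, electrical engineering with instruments.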

DISCUSSION AND FUTURE DIRECTIONS
This study aims to examine supply-side technology convergence in the ICT sector through patent IPC analysis. Based on the IPC codes of 46,363 sample patents from 1995 to 2008 in ICT, association rule mining is used to find the degree of overlap and the relationships between different technology domains. The results of association rule mining of the ICT firm's patent co-classification clearly show convergence between different technological domains. Viewed as a sequential process, technological convergence can trigger market convergence through new products, after which firms begin to merge with each other, completing the convergence process with industry fusion. Curran and Leker (2011) discussed the phases of convergence based on the assumption of an idealized time series of events: scientific/knowledge convergence, technology convergence, market/applicational convergence, and industry convergence. Also, as scholars have suggested in previous studies, convergence can be considered along several dimensions: supply/demand, substitution/complementation, and product-based/technology-based.
The context of this study is ICT technology convergence at the firm level, with the firm as a technology supplier, considering the degree and scope of technology convergence. The results imply that technology convergence in a firm occurs mainly within its dominant technology areas, and that the major technologies tend to be merged with technologies from other areas, expanding the scope of convergence. However, the results do not indicate that technological convergence changes the technological paradigm at the firm level. The scope of convergence can be related to the firm's capability for innovation and its technological competitiveness. As discussed in previous literature (Curran and Leker, 2011), convergence starts with knowledge convergence. Additionally, this study examines the usefulness of association rule mining for indicating technological convergence. As technology develops, new phenomena appear, and some of them are difficult to describe or analyze with conventional methods. The association rule mining approach is appropriate for describing complicated relational data and discovering important patterns in them. Association rule mining analysis can be applied to indicate not only technology convergence but also knowledge convergence.
Technological convergence, as an emerging research field, is being studied by many scholars, and the impact of this new phenomenon is enormous. For example, since the introduction of the smartphone, a representative outcome of convergence, many new business models and applications related to the smartphone have appeared, and our lifestyle has changed rapidly. In order to improve the capability of forecasting technology development, an in-depth understanding of the technological convergence phenomenon is essential. Although this research considers only one factor, the patent classification code, in discussing technological convergence, knowledge transfer from an open innovation perspective can also be examined using the same mining analysis. Therefore, an empirical analysis of the relationship between technology convergence and knowledge will be addressed in a further study.

Table 1. Patent statistics for major IPC subclasses of the sample dataset (n=43,636)

Table 2. Association rules* for IPC co-classification of the sample dataset**