Pardoe, David and Peter Stone. "TacTex-03: A Supply Chain Management Agent". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-308 (technical report). February 2004. 8 pages.
This paper introduces TacTex-03, an agent designed to participate in the Trading Agent Competition Supply Chain Management Scenario (TAC SCM). As specified by this scenario, TacTex-03 acts as a simulated computer manufacturer in charge of buying components such as chips and motherboards, manufacturing different types of computers, and selling them to customers. TacTex-03 was the top scorer in two of the preliminary rounds of the 2003 TAC SCM competition, and finished in 3rd place in the finals.
Jong, Nicholas K. and Peter Stone. "Towards employing PSRs in a continuous domain". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-309 (technical report). February 23, 2004. 4 pages.
Predictive State Representations (PSRs) recently emerged as an alternative framework for reasoning about stochastic environments. However, unlike Markov decision processes, they have not yet been extended to large domains or domains with continuous state variables. This report briefly describes an attempt to scale PSRs to such domains. Our goal was to construct a PSR allowing an agent to track its location on the simulated soccer field used in Robocup. This line of work ended in a negative result.
Mayberry III, Marshall R. "Incremental Nonmonotonic Parsing through Semantic Self-Organization". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-310 (dissertation). May 2003. 148 pages.
Subsymbolic systems have been successfully used to model several aspects of human language processing. Subsymbolic parsers are appealing because they allow combining syntactic, semantic, and thematic constraints in sentence interpretation and nonmonotonically revising that interpretation while incrementally processing a sentence. Such parsers are also cognitively plausible: processing is robust and multiple interpretations are simultaneously activated when the input is ambiguous. Yet, it has proven very difficult to scale them up to realistic language. They have limited memory capacity, training takes a long time, and it is difficult to represent linguistic structure. A new connectionist model, INSOMNet, scales up the subsymbolic approach by utilizing semantic self-organization. INSOMNet was trained on semantic dependency graph representations from the recently-released LinGO Redwoods HPSG Treebank of sentences from the VerbMobil project. The results show that INSOMNet accurately learns to represent these semantic dependencies and generalizes to novel structures. Further evaluation of INSOMNet on the original VerbMobil sentences transcribed with annotations for spoken language demonstrates robust parsing of noisy input, while graceful degradation in performance from adding noise to the network weights underscores INSOMNet's tolerance to damage. Finally, the cognitive plausibility of the model is shown on a standard psycholinguistic benchmark, in which INSOMNet demonstrates expectations and defaults, coactivation of multiple interpretations, nonmonotonicity, and semantic priming.
Nahm, Un Yong. "Text Mining with Information Extraction". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-311 (dissertation). October 2004. 132 pages.
The popularity of the Web and the large number of documents available in electronic form have motivated the search for hidden knowledge in text collections. Consequently, there is growing research interest in the general topic of text mining. In this paper, we develop a text-mining system by integrating methods from Information Extraction (IE) and Data Mining (Knowledge Discovery from Databases or KDD). By utilizing existing IE and KDD techniques, text-mining systems can be developed relatively rapidly and evaluated on existing text corpora for testing IE systems. We present a general text-mining framework called DiscoTEX which employs an IE module for transforming natural-language documents into structured data and a KDD module for discovering prediction rules from the extracted data. When discovering patterns in extracted text, strict matching of strings is inadequate because textual database entries generally exhibit variations due to typographical errors, misspellings, abbreviations, and other sources. We introduce the notion of discovering "soft-matching" rules from text and present two new learning algorithms. TextRISE is an inductive method for learning soft-matching prediction rules that integrates rule-based and instance-based learning methods. Simple, interpretable rules are discovered using rule induction, while a nearest-neighbor algorithm provides soft matching. SoftApriori is a text-mining algorithm for discovering association rules from texts that uses a similarity measure to allow flexible matching to variable database items. We present experimental results on inducing prediction and association rules from natural-language texts demonstrating that TextRISE and SoftApriori learn more accurate rules than previous methods for these tasks. We also present an approach to using rules mined from extracted data to improve the accuracy of information extraction.
Experimental results demonstrate that such discovered patterns can be used to effectively improve the underlying IE method.
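To make the idea of soft matching concrete, the following is a minimal sketch of counting how often an item appears in a set of extracted records when near-identical strings are treated as the same item. It is not the TextRISE or SoftApriori algorithm: the abstract does not name the similarity measure, so the use of Python's `difflib.SequenceMatcher`, the 0.8 threshold, and the function names `similar` and `soft_support` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Return True when two strings are close enough to count as one item.

    The similarity measure and threshold are illustrative assumptions.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def soft_support(item: str, records: list[set[str]], threshold: float = 0.8) -> float:
    """Fraction of records containing some entry that soft-matches `item`."""
    if not records:
        return 0.0
    hits = sum(
        1
        for record in records
        if any(similar(item, entry, threshold) for entry in record)
    )
    return hits / len(records)


# Hypothetical extracted records; the second contains a misspelling.
records = [{"Microsoft Windows"}, {"Microsft Windows"}, {"Linux"}]
strict = soft_support("Microsoft Windows", records, threshold=1.0)
soft = soft_support("Microsoft Windows", records)
```

With strict matching (threshold 1.0) the misspelled record is missed, while the softer threshold counts it, which is the kind of variation the abstract says strict string matching handles poorly.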
Stanley, Kenneth O., Bobby D. Bryant, and Risto Miikkulainen. "The NERO Real-time Video Game". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-312 (technical report). October 2004. 36 pages.
In most modern video games, character behavior is scripted; no matter how many times the player exploits a weakness, that weakness is never repaired. Yet if game characters could learn through interacting with the player, behavior could improve as the game is played, keeping it interesting. This paper introduces the real-time NeuroEvolution of Augmenting Topologies (rtNEAT) method for evolving increasingly complex artificial neural networks in real-time, as a game is being played. The rtNEAT method allows agents to change and improve during the game. In fact, rtNEAT makes possible an entirely new genre of video games in which the player teaches a team of agents through a series of customized training exercises. In order to demonstrate this concept in the NeuroEvolving Robotic Operatives (NERO) game, the player trains a team of robots for combat. This paper describes results from this novel application of machine learning, and demonstrates that rtNEAT makes possible video games like NERO where agents evolve and adapt in real time. In the future, rtNEAT may allow new kinds of educational and training applications.
Stone, Peter, Kurt Dresner, Peggy Fidelman, Nicholas K. Jong, Nate Kohl, Gregory Kuhlmann, Mohan Sridharan, and Daniel Stronger. "The UT Austin Villa 2004 RoboCup Four-Legged Team: Coming of Age". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-313 (technical report). October 27, 2004. 46 pages.
The UT Austin Villa Four-Legged Team for RoboCup 2004 was a second-time entry in the ongoing series of RoboCup legged league competitions. The team development began in mid-January of 2003 without any prior familiarity with the Aibos. After entering a fairly non-competitive team in RoboCup 2003, the team made several important advances. By the July 2004 competition, which took place in Lisbon, Portugal, it was one of the top few teams. In this report, we describe both our development process and the technical details of its end result. In conjunction with our previous technical report, this paper provides full documentation of the algorithms behind our approach with the goal of making them fully replicable.
Stanley, Kenneth Owen. "Efficient Evolution of Neural Networks through Complexification". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-314 (dissertation). August 2004. 180 pages.
Artificial neural networks can potentially control autonomous robots, vehicles, factories, or game players more robustly than traditional approaches. Neuroevolution, i.e. the artificial evolution of neural networks, is a method for finding the right topology and connection weights to specify the desired control behavior. The challenge for neuroevolution is that difficult tasks may require complex networks with many connections, all of which must be set to the right values. Even if a network exists that can solve the task, evolution may not be able to find it in such a high-dimensional search space. This dissertation presents the NeuroEvolution of Augmenting Topologies (NEAT) method, which makes search for complex solutions feasible. In a process called complexification, NEAT begins by searching in a space of simple networks, and gradually makes them more complex as the search progresses. By starting minimally, NEAT is more likely to find efficient and robust solutions than neuroevolution methods that begin with large fixed or randomized topologies; by elaborating on existing solutions, it can gradually construct even highly complex solutions. In this dissertation, NEAT is first shown faster than traditional approaches on a challenging reinforcement learning benchmark task. Second, by building on existing structure, it is shown to maintain an "arms race" even in open-ended coevolution. Third, NEAT is used to successfully discover complex behavior in three challenging domains: the game of Go, an automobile warning system, and a real-time interactive video game. Experimental results in these domains demonstrate that NEAT makes entirely new applications of machine learning possible.
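As a rough illustration of complexification, the sketch below shows one structural mutation that grows a network by splitting an existing connection with a new node. The abstract does not describe the mutation operators, so this follows the add-node mutation as it is commonly described for NEAT and leaves out innovation numbers, speciation, and crossover entirely. The `Genome` and `Connection` classes and the `add_node` function are hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Connection:
    src: int
    dst: int
    weight: float
    enabled: bool = True


@dataclass
class Genome:
    num_nodes: int
    connections: list[Connection] = field(default_factory=list)


def add_node(genome: Genome, rng: random.Random) -> None:
    """Split one enabled connection by inserting a new node in the middle."""
    enabled = [c for c in genome.connections if c.enabled]
    if not enabled:
        return
    old = rng.choice(enabled)
    old.enabled = False  # the old link is kept in the genome but switched off
    new_id = genome.num_nodes
    genome.num_nodes += 1
    # Incoming weight 1.0 and outgoing weight equal to the old one, so the
    # enlarged network starts out behaving much like the original.
    genome.connections.append(Connection(old.src, new_id, 1.0))
    genome.connections.append(Connection(new_id, old.dst, old.weight))


# Start minimally: two nodes joined by a single connection.
genome = Genome(num_nodes=2, connections=[Connection(0, 1, 0.5)])
add_node(genome, random.Random(0))
```

Starting from the smallest genome and applying such mutations only occasionally is one way to read the abstract's description of searching simple networks first and elaborating on them later.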
Dresner, Kurt and Peter Stone. "Multiagent Traffic Management: Driver Agent Improvements And A Protocol for Intersection Control". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-315 (technical report). May 1, 2005. 16 pages.
Traffic congestion is one of the leading causes of lost productivity and decreased standard of living in urban settings. Recent advances in artificial intelligence suggest vehicle navigation by autonomous agents will be possible in the near future. In a previous paper, we proposed a reservation-based system for alleviating traffic congestion, specifically at intersections. This paper extends our prototype implementation in several ways with the aim of making it more implementable in the real world. In particular, we add the ability of vehicles to turn, enable them to accelerate while in the intersection, improve the efficiency and sensor model of the driver agents, and augment their interaction capabilities with a detailed protocol such that the vehicles do not need to know anything about the intersection control policy. The use of this protocol limits the interaction of the driver agent and the intersection manager to the extent that it is a reasonable approximation of reliable wireless communication. Finally, we describe how different intersection control policies can be expressed with this protocol and limited exchange of information. All improvements are fully implemented and tested, and we present detailed empirical results validating their effectiveness.
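The reservation idea can be sketched as an intersection manager that accepts or rejects requests from driver agents. Everything concrete here is an assumption for illustration: the abstract does not say how the intersection is modeled or which policy is used, so the grid of space-time tiles, the first-come-first-served rule, and the names `IntersectionManager`, `request`, and `cancel` are hypothetical.

```python
Tile = tuple[int, int, int]  # (time_step, row, col), an assumed discretization


class IntersectionManager:
    """Grants a reservation only if none of its space-time tiles is taken."""

    def __init__(self) -> None:
        self._reserved: dict[Tile, str] = {}  # tile -> id of the holding vehicle

    def request(self, vehicle_id: str, tiles: list[Tile]) -> bool:
        """Reserve all tiles for the vehicle, or reject the whole request."""
        if any(tile in self._reserved for tile in tiles):
            return False  # conflict: the driver agent must ask again later
        for tile in tiles:
            self._reserved[tile] = vehicle_id
        return True

    def cancel(self, vehicle_id: str) -> None:
        """Release every tile held by the given vehicle."""
        self._reserved = {
            tile: holder
            for tile, holder in self._reserved.items()
            if holder != vehicle_id
        }


manager = IntersectionManager()
granted_a = manager.request("car-a", [(0, 0, 0), (1, 0, 1)])
granted_b = manager.request("car-b", [(1, 0, 1)])  # overlaps car-a at step 1
retry_b = manager.request("car-b", [(2, 0, 1)])    # same cell, one step later
```

The driver only sees a yes or no answer, which matches the abstract's point that vehicles need not know anything about the control policy behind the decision.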
Whiteson, Shimon and Peter Stone. "Synthesizing Policy Search and Temporal Difference Methods for Reinforcement Learning". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-316 (technical report). December 14, 2004. 33 pages.
In many machine learning problems, an agent must learn a policy for selecting actions based on its state, which consists of its current knowledge about the world. Reinforcement learning problems are the subset of these tasks for which the agent must learn a policy without ever seeing examples of correct behavior. Instead, it receives only positive and negative feedback for the actions it tries. Since many practical, real world problems (such as robot control, game playing, and system optimization) fall in this category, developing effective reinforcement learning algorithms is critical to the progress of artificial intelligence. Two classes of methods currently exist for tackling reinforcement learning problems. The first class, policy search methods, uses techniques such as genetic algorithms, simulated annealing, or hill climbing to search the space of possible policies for one that performs the required task well. The second class, temporal difference methods, uses dynamic programming and statistical sampling to estimate the long-term value of taking each possible action in each possible state; the agent's policy is derived from the resulting estimates. Temporal difference methods have two important advantages over policy search methods: 1) they can learn in on-line scenarios, where the actions the agent takes during learning have real-world consequences and 2) they can adapt to nonstationary environments, where the optimal policy may be changing. However, existing temporal difference methods also have a significant disadvantage. Learning an effective policy requires having a good representation for it, which temporal difference methods cannot learn. Instead, a human designer must manually (and often tediously) select an appropriate representation. By contrast, policy search methods have recently been developed that can learn the policy and its representation simultaneously. 
The goal of this article is to develop and test techniques that, by synthesizing policy search and temporal difference methods, can simultaneously reap their contrasting advantages. To this end, we introduce 1) a modification to policy search methods that improves their ability to learn on-line and 2) a novel way to combine this on-line version of policy search with temporal difference methods. In preliminary experiments, we combine NeuroEvolution of Augmenting Topologies (NEAT), a policy search method, with Q-learning, a temporal difference method. In this scenario, NEAT evolves both the topologies and initial weights of a population of neural networks that approximate the agent's value function. During each network's lifetime, its weights are updated via Q-learning. Initial results on a complex elevator control problem demonstrate that NEAT combined with Q-learning in this fashion learns better policies than either NEAT or Q-learning alone. This advantage occurs in both off-line and on-line scenarios and in both stationary and nonstationary environments.
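For readers unfamiliar with Q-learning, the standard update can be sketched as follows. This is a simplification: it uses a lookup table, whereas the report describes neural networks evolved by NEAT as the value-function approximators, and the step size `alpha`, discount `gamma`, and the function name `q_update` are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem with one value already learned.
q = defaultdict(float)
q[("s1", "up")] = 2.0
new_value = q_update(q, "s0", "up", 1.0, "s1", ["up", "down"], alpha=0.5, gamma=0.9)
```

In the combined approach the abstract describes, an update of this kind would adjust a network's weights during its lifetime, while evolution selects the topologies and initial weights.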
Kaczmarczyk, Lisa C., Risto Miikkulainen, and Mary Z. Last. "The Effect of Delivery Method on Strategy and Conceptual Development". The University of Texas at Austin, Department of Computer Sciences. Report# AI04-318 (technical report). December 2004. 7 pages.
In order to develop intellectual expertise, the student needs to learn how to perform sophisticated pattern identification, and to employ effective study and test taking strategies. These cognitive requirements are complex and analytical. In order to help students succeed in their chosen field, we need to better understand how instruction can best help develop these meta-cognitive skills. This paper reports the results of a study in which novices attempted to categorize calculus integration problems in one of three delivery methods (Drill and Test, Fully Integrated, Incremental Learning). The results demonstrate that Incremental Learners develop the most effective study and test taking strategies, have the best conceptual development, and have the most positive reactions to learning. The results, together with a previously reported computational study, confirm that Incremental Learners develop the best meta-cognitive attributes necessary for expertise.