Page 1 of results: 3950 digital items found in 0.029 seconds

Avaliação de Impacto em projetos sociais do terceiro setor: uma contribuição da teoria de policy learning [Impact Evaluation in third-sector social projects: a contribution from policy learning theory]

Dolabella, Verena Assunção Jacques
Source: Fundação Getúlio Vargas | Publisher: Fundação Getúlio Vargas
Type: Dissertation
Language: Portuguese
Search relevance: 66.23%
This research examines learning in third-sector (nonprofit) organizations along two main axes: 1) Learning through empirical analysis (Impact Evaluation): the aim is to understand the key recommendations for the third sector regarding Impact Evaluation and the current state of these practices in the Brazilian context. To this end, 50 institutions belonging to a network of organizations in the field of education were interviewed to understand how evaluation is being applied; 2) Learning through the dimension of Policy Learning (Social Learning and Instrumental Learning): the aim is to connect theories of learning in public policy with third-sector institutions and to analyze opportunities for learning from other experiences and best practices.

Adding Value to Evaluations : Applying the Governmental Learning Spiral for Evaluation-Based Learning

Nashat, Bidjan; Speer, Sandra; Blindenbacher, Raoul
Source: World Bank, Washington, DC | Publisher: World Bank, Washington, DC
Language: Portuguese
Search relevance: 46.19%
Governmental learning has a multidisciplinary research tradition, and a plethora of literature exists on organizational as well as policy learning. Different concepts exist for structured learning from evaluation results at the governmental level. Common to all is that they depend on a careful selection of participants and that the political, cultural, and institutional environment is key to the ultimate success of many governmental learning activities. Policy learning can be fostered by various types of organized activities, ranging from peer-review frameworks often focused on accountability to international learning processes based on concepts like the governmental learning spiral. This paper discusses and analyzes four examples of evaluation-based governmental learning organized in the framework of the World Bank. The contribution reflects on different streams of learning theory for the governmental level, as they represent the assumptions and motivations for organized learning in governments. The governmental learning spiral...

Indigenous education: experiential learning and learning through country

Fogarty, William; Schwab, Robert G.
Source: ANU, Centre for Aboriginal Economic Policy Research (CAEPR); http://caepr.anu.edu.au/ | Publisher: ANU, Centre for Aboriginal Economic Policy Research (CAEPR); http://caepr.anu.edu.au/
Type: Working/Technical Paper | Format: 24 pages
Language: Portuguese
Search relevance: 46.18%
In Indigenous policy circles there is an increasingly desperate desire to lift the educational and employment outcomes of remote Indigenous students, relative to their non-Indigenous peers in the rest of Australia. A lack of engagement with education and a scarcity of jobs underpin this policy anxiety. This paper queries some current policy approaches to these issues and seeks to provide a practical and grounded perspective to education programs in remote Indigenous Australia. We question and challenge the weight current policy agendas are ascribing to literacy and numeracy attainment through direct and classroom based instruction. Alternatively, we seek to reinvigorate the notion that quality education can comprise other modes of learning and include community based educational approaches. As an example we outline the importance of Indigenous land and sea management (ILSM) as a development and employment activity for Indigenous people living in remote regions of Australia, and show how remote education programs are connecting to ILSM to provide local ‘Learning through Country’ solutions. From research conducted in a diversity of remote Aboriginal education and employment contexts, we find that there is a commonality of issues confronting attempts to link education with work and development activity. We finish by giving voice to some of these issues and offer insights relevant for educators and policy makers.

Once bitten, twice shy : financial crises, policy learning and mortgage markets in advanced capitalist economies

BAYRAM, Ismail Emre
Source: European University Institute | Publisher: European University Institute
Type: Doctoral thesis
Language: Portuguese
Search relevance: 46.12%
Defence date: 30 April 2014; Examining Board: Professor Sven Steinmo, European University Institute (Supervisor); Professor Pepper Culpepper, European University Institute; Professor Peter Englund, Stockholm School of Economics; Professor Gunnar Trumbull, Harvard Business School. Do nations learn from their financial crises? In addressing this question, this dissertation explores whether politicians, supervisors and bankers change their preferences towards financial markets when they recognize they have made significant mistakes in the recent past. It also asks whether such recognition of failure leads to a process of change in rules, policies and institutions, in different national contexts. In addressing this broader theoretical question, the dissertation focuses on the mortgage credit markets in advanced capitalist economies. Challenging the conventional approaches in political science and financial economics, it shows that the longitudinal and cross-sectional variations in mortgage credit markets can best be explained with reference to nations' different experiences of financial crisis. Borrowing insights from learning theory in political economy and public policy analysis, it argues specifically that those nations (i) that had severe financial crises in their recent past and (ii) that have coordinative institutions and elites...

The Power of Ideas: The OECD and Labour Market Policy in Canada, Denmark and Sweden

GRINVALDS, HOLLY S
Source: Queen's University | Publisher: Queen's University
Type: Doctoral thesis
Language: Portuguese
Search relevance: 56.19%
This thesis advances our understanding of how ideas play a role in policy making by examining the processes and conditions that facilitate their international diffusion into domestic debates, their acceptance by policy actors, and the ways in which their acceptance alters policy processes and policy itself. Specifically, the thesis studies the impact of labour market policy ideas from the Organization for Economic Cooperation and Development (OECD) and its large-scale study on unemployment, the Jobs Study, in three OECD member states: Canada, Denmark and Sweden. This thesis shows that ideas play a number of roles: sometimes they are simply employed to help legitimize pre-determined policy positions; but sometimes a process of learning takes place, and new ideas change actors' beliefs about what is and what ought to be, as well as their conception of their own interests and goals. Consistent with previous research, policy failure and uncertainty open actors up to the policy learning process and acceptance of new ideas. More than earlier studies, however, this thesis highlights the role of pre-existing beliefs. Accepting one new idea over another is largely determined by the values and beliefs actors bring to bear when judging new ideas; and thus...

Uncertainty and learning in sequential decision-making : the case of climate policy

Webster, Mort David
Source: Massachusetts Institute of Technology | Publisher: Massachusetts Institute of Technology
Type: Doctoral thesis | Format: 240 p.; application/pdf
Language: Portuguese
Search relevance: 46.21%
The debate over a policy response to global climate change has been and continues to be deadlocked between 1) the view that the impacts of climate change are too uncertain, so any policy response should be delayed until we learn more, and 2) the view that we cannot wait to resolve the uncertainty because climate change is irreversible, so we must take precautionary measures now. The objective of this dissertation is to sort out the role of waiting for better information in choosing an appropriate level of emissions abatement activities today under uncertainty. In this dissertation, we construct two-period sequential decision models to represent the choice of a level of emissions abatement over the next decade and another choice for the remainder of this century; these include both empirical models based on a climate model of intermediate complexity and analytical dynamic programming models. Using the analytical models, we show that for learning to influence the decision made before the learning occurs, an interaction must be present between strategies in the two decision periods. We define an "interaction" as the dependence of the marginal cost or marginal damage of the future decision on today's decision. When an interaction is present and is uncertain...
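As a toy numerical illustration of the "interaction" condition described in this abstract (not the dissertation's actual models), the sketch below compares the optimal first-period abatement choice when the uncertain damage parameter is learned before the second-period choice versus never learned. All numbers, cost functions, and the brute-force grid search are invented for illustration; with separable (non-interacting) damages the two answers coincide, while with cumulative damages they differ.

```python
# Toy two-period abatement choice: learning before period 2 changes the optimal
# period-1 decision only if the period-2 marginal cost/damage depends on the
# period-1 decision. All numbers below are made up for illustration.
import numpy as np

grid = np.linspace(0.0, 2.0, 101)     # candidate abatement levels
damage_scenarios = [0.5, 2.0]         # uncertain damage parameter d
probs = [0.5, 0.5]

def total_cost(a1, a2, d, interacting):
    abatement_cost = a1 ** 2 + a2 ** 2
    if interacting:                   # damages depend on cumulative abatement
        damages = d * max(0.0, 4.0 - a1 - a2) ** 2
    else:                             # separable damages: no interaction
        damages = d * max(0.0, 2.0 - a1) ** 2 + d * max(0.0, 2.0 - a2) ** 2
    return abatement_cost + damages

def best_a1(learn, interacting):
    """Expected-cost-minimizing period-1 abatement, by brute-force grid search."""
    def expected_cost(a1):
        if learn:   # d is revealed before the period-2 choice
            return sum(p * min(total_cost(a1, a2, d, interacting) for a2 in grid)
                       for p, d in zip(probs, damage_scenarios))
        else:       # the period-2 choice must be made on expected damages only
            return min(sum(p * total_cost(a1, a2, d, interacting)
                           for p, d in zip(probs, damage_scenarios)) for a2 in grid)
    return min(grid, key=expected_cost)

for interacting in (False, True):
    a1_learn = best_a1(learn=True, interacting=interacting)
    a1_no_learn = best_a1(learn=False, interacting=interacting)
    print(f"interaction={interacting}: a1 with learning={a1_learn:.2f}, "
          f"a1 without learning={a1_no_learn:.2f}")
```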

Learning resources and open access in higher education institutions in Ireland

Risquez, Angelica; McAvinia, Claire; O’Keeffe, Anne; Bruen, Catherine; Desmond, Yvonne; Rooney, Pauline; Flynn, Sharon; Ryan, Deirdre; Farr, Fiona; Quinn, Ann Marcus; Coughlan, Ann
Source: NDLR: National Digital Learning Resources | Publisher: NDLR: National Digital Learning Resources
Type: info:eu-repo/semantics/report; all_ul_research
Language: Portuguese
Search relevance: 55.91%
Peer-reviewed. Over the past decade or so, the open education movement has continued to gather momentum in higher education, spurred on by increasing demand for more flexible education options; by the potential of developments in technology and infrastructure; by advocacy at policy level; and by initiatives and developments at national and international levels. Open educational resources (OER), one element of the open education movement, have seen exponential growth in this period. Navigating this OER landscape poses a number of important issues and questions for the practice of teaching and learning. From an educational development perspective, the focus rests on investigating how both students and teachers can use and share open educational resources in ways that optimally enhance teaching and learning.

Policy transfer: a tool for political development. A comparative study of child welfare services between Norway and Bolivia

Crespo, Katya Andrea Nogales
Source: Instituto Universitário de Lisboa | Publisher: Instituto Universitário de Lisboa
Type: Master's dissertation
Published: /06/2015 | Language: Portuguese
Search relevance: 56.1%
The value of social work is not only to reinforce preconceived ideals. Sometimes the best social work practice may be to contribute to fairer and more effective social institutions. If, for instance, we picture a context in country A of a well-ordered social policy and a country B where it is not working, the chances of effective social work would be higher in country A. Should and could the shape of the policy in country A then be exported to country B? This is the paramount question I address at two levels in this thesis. First, I explore the policy issue: the scope, capacity and implementation conditions in two different social contexts, Norway and Bolivia, using the child welfare service as an exemplar policy area. Then I analyze whether such policies, using Norway as a model, may be transferred from one context to the other, within the areas of policy learning, policy making, policy transfer and political development. I not only found a possible way by which political development can be promoted using policy transfer; this research also contributes an innovative scope to transfer research: transfer as a tool for political development.

Off-Policy Actor-Critic

Degris, Thomas; White, Martha; Sutton, Richard S.
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Language: Portuguese
Search relevance: 46.25%
This paper presents the first actor-critic algorithm for off-policy reinforcement learning. Our algorithm is online and incremental, and its per-time-step complexity scales linearly with the number of learned weights. Previous work on actor-critic algorithms is limited to the on-policy setting and does not take advantage of the recent advances in off-policy gradient temporal-difference learning. Off-policy techniques, such as Greedy-GQ, enable a target policy to be learned while following and obtaining data from another (behavior) policy. For many problems, however, actor-critic methods are more practical than action value methods (like Greedy-GQ) because they explicitly represent the policy; consequently, the policy can be stochastic and utilize a large action space. In this paper, we illustrate how to practically combine the generality and learning potential of off-policy learning with the flexibility in action selection given by actor-critic methods. We derive an incremental, linear time and space complexity algorithm that includes eligibility traces, prove convergence under assumptions similar to previous off-policy algorithms, and empirically show better or comparable performance to existing algorithms on standard reinforcement-learning benchmark problems.; Comment: Full version of the paper...
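For orientation, the sketch below illustrates the general shape of an off-policy actor-critic update with tabular features and per-step importance-sampling ratios: a linear critic is updated from behavior-policy data, and the actor follows the importance-weighted TD error. It is a minimal illustration only, not the paper's Off-PAC algorithm (which uses gradient-TD critics and eligibility traces); the toy chain environment and step sizes are invented.

```python
# Minimal off-policy actor-critic sketch: uniform-random behavior policy,
# softmax target policy, importance-sampling ratio rho, TD(0) critic.
import numpy as np

n_states, n_actions = 5, 2                 # small chain; actions: left/right
gamma, alpha_v, alpha_pi = 0.95, 0.1, 0.01
rng = np.random.default_rng(0)
v = np.zeros(n_states)                     # critic: state values (one-hot features)
theta = np.zeros((n_states, n_actions))    # actor: target-policy preferences

def target_probs(s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def step(s, a):
    s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

s = 0
for t in range(20000):
    a = int(rng.integers(n_actions))                 # behavior: uniform random
    s2, r = step(s, a)
    rho = target_probs(s)[a] / (1.0 / n_actions)     # importance-sampling ratio
    delta = r + gamma * v[s2] - v[s]                 # TD error
    v[s] += alpha_v * rho * delta                    # importance-weighted critic update
    grad_log = -target_probs(s)
    grad_log[a] += 1.0                               # grad of log pi(a|s) w.r.t. theta[s]
    theta[s] += alpha_pi * rho * delta * grad_log    # actor update
    s = 0 if s2 == n_states - 1 else s2

print("critic values:", np.round(v, 2))
print("target policy P(right):", np.round([target_probs(i)[1] for i in range(n_states)], 2))
```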

Off-policy Learning with Eligibility Traces: A Survey

Geist, Matthieu; Scherrer, Bruno
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 15/04/2013 | Language: Portuguese
Search relevance: 46.2%
In the framework of Markov Decision Processes, off-policy learning is the problem of learning a linear approximation of the value function of some fixed policy from a single trajectory possibly generated by some other policy. We briefly review the on-policy learning algorithms of the literature (gradient-based and least-squares-based), adopting a unified algorithmic view. Then, we highlight a systematic approach for adapting them to off-policy learning with eligibility traces. This leads to some known algorithms - off-policy LSTD(\lambda), LSPE(\lambda), TD(\lambda), TDC/GQ(\lambda) - and suggests new extensions - off-policy FPKF(\lambda), BRM(\lambda), gBRM(\lambda), GTD2(\lambda). We describe a comprehensive algorithmic derivation of all algorithms in a recursive and memory-efficient form, discuss their known convergence properties and illustrate their relative empirical behavior on Garnet problems. Our experiments suggest that the most standard algorithms - on- and off-policy LSTD(\lambda)/LSPE(\lambda), and TD(\lambda) if the feature space dimension is too large for a least-squares approach - perform the best.
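As background, the sketch below shows one classical variant the survey covers: plain off-policy TD(\lambda) with linear features, accumulating eligibility traces, and per-decision importance-sampling ratios. It is illustrative only (this uncorrected variant is not guaranteed to converge off-policy, as the survey discusses), and the random transition stream at the end is a stand-in rather than a real MDP.

```python
# Plain off-policy TD(lambda) with linear features and importance-sampling
# ratios; illustrative sketch, not a specific algorithm reproduced verbatim.
import numpy as np

def off_policy_td_lambda(transitions, n_features, gamma=0.9, lam=0.6, alpha=0.02):
    """transitions: iterable of (phi, rho, reward, phi_next), where rho is
    pi(a|s)/mu(a|s) for the action taken by the behavior policy mu."""
    theta = np.zeros(n_features)
    e = np.zeros(n_features)                  # eligibility trace
    for phi, rho, r, phi_next in transitions:
        e = rho * (gamma * lam * e + phi)     # importance-weighted accumulating trace
        delta = r + gamma * theta @ phi_next - theta @ phi
        theta += alpha * delta * e
    return theta

# Placeholder usage on a random stand-in stream (no real MDP here):
rng = np.random.default_rng(1)
stream = [(rng.random(4), 1.0, rng.random(), rng.random(4)) for _ in range(500)]
print(np.round(off_policy_td_lambda(stream, n_features=4), 2))
```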

Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation

Mori, Syogo; Tangkaratt, Voot; Zhao, Tingting; Morimoto, Jun; Sugiyama, Masashi
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 18/07/2013 | Language: Portuguese
Search relevance: 46.16%
The goal of reinforcement learning (RL) is to let an agent learn an optimal control policy in an unknown environment so that future expected rewards are maximized. The model-free RL approach directly learns the policy based on data samples. Although using many samples tends to improve the accuracy of policy learning, collecting a large number of samples is often expensive in practice. On the other hand, the model-based RL approach first estimates the transition model of the environment and then learns the policy based on the estimated transition model. Thus, if the transition model is accurately learned from a small amount of data, the model-based approach can perform better than the model-free approach. In this paper, we propose a novel model-based RL method by combining a recently proposed model-free policy search method called policy gradients with parameter-based exploration and the state-of-the-art transition model estimator called least-squares conditional density estimation. Through experiments, we demonstrate the practical usefulness of the proposed method.
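The sketch below illustrates only the model-free ingredient named in the abstract, policy gradients with parameter-based exploration (PGPE): policy parameters are sampled from a Gaussian hyper-distribution, each sample is scored by a rollout, and the mean of the hyper-distribution follows a likelihood-ratio gradient estimate. The toy 1-D point-mass task and hyperparameters are made up, the exploration standard deviation is kept fixed for brevity (full PGPE also adapts it), and the paper's model-based component (least-squares conditional density estimation of the transition model) is not shown.

```python
# PGPE-style sketch: sample policy parameters, score by rollout, move the
# hyper-distribution mean along a likelihood-ratio gradient estimate.
import numpy as np

rng = np.random.default_rng(0)

def rollout_return(w, horizon=30):
    """Return of the linear policy u = -w * x on a made-up 1-D point mass."""
    x, ret = 2.0, 0.0
    for _ in range(horizon):
        u = -w * x
        x = x + 0.2 * u                      # toy deterministic dynamics
        ret += -(x ** 2) - 0.1 * u ** 2      # quadratic state/control cost
    return ret

mu, sigma = 0.0, 1.0        # Gaussian hyper-distribution over the policy parameter
alpha, n_samples = 0.1, 20
for _ in range(300):
    thetas = rng.normal(mu, sigma, size=n_samples)
    returns = np.array([rollout_return(th) for th in thetas])
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)     # baseline + normalization
    mu += alpha * float(np.mean(adv * (thetas - mu) / sigma ** 2))  # likelihood-ratio gradient
print(f"learned mean policy parameter: {mu:.2f}")
```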

Learning Deep Neural Network Policies with Continuous Memory States

Zhang, Marvin; McCarthy, Zoe; Finn, Chelsea; Levine, Sergey; Abbeel, Pieter
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Language: Portuguese
Search relevance: 46.22%
Policy learning for partially observed control tasks requires policies that can remember salient information from past observations. In this paper, we present a method for learning policies with internal memory for high-dimensional, continuous systems, such as robotic manipulators. Our approach consists of augmenting the state and action space of the system with continuous-valued memory states that the policy can read from and write to. Learning general-purpose policies with this type of memory representation directly is difficult, because the policy must automatically figure out the most salient information to memorize at each time step. We show that, by decomposing this policy search problem into a trajectory optimization phase and a supervised learning phase through a method called guided policy search, we can acquire policies with effective memorization and recall strategies. Intuitively, the trajectory optimization phase chooses the values of the memory states that will make it easier for the policy to produce the right action in future states, while the supervised learning phase encourages the policy to use memorization actions to produce those memory states. We evaluate our method on tasks involving continuous control in manipulation and navigation settings...
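The sketch below shows only the state/action augmentation the abstract describes: the policy reads a concatenation of the observation and a continuous memory vector, and its output carries both a control command and a memory write that persists to the next step. The random linear "policy" is a stand-in for the neural-network policy trained with guided policy search in the paper; all dimensions are placeholders.

```python
# Memory-augmented policy interface: input [obs, memory], output [control, write].
import numpy as np

obs_dim, ctrl_dim, mem_dim = 4, 2, 3
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(ctrl_dim + mem_dim, obs_dim + mem_dim))  # stand-in policy

def policy(obs, memory):
    out = W @ np.concatenate([obs, memory])       # augmented input
    control, memory_write = out[:ctrl_dim], out[ctrl_dim:]
    return control, np.tanh(memory_write)         # keep memory values bounded

memory = np.zeros(mem_dim)
for t in range(5):
    obs = rng.normal(size=obs_dim)                # placeholder observation stream
    control, memory = policy(obs, memory)         # memory persists across steps
    print(f"t={t}  control={np.round(control, 2)}  memory={np.round(memory, 2)}")
```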

Scaling Life-long Off-policy Learning

White, Adam; Modayil, Joseph; Sutton, Richard S.
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 27/06/2012 | Language: Portuguese
Search relevance: 56.26%
We pursue a life-long learning approach to artificial intelligence that makes extensive use of reinforcement learning algorithms. We build on our prior work with general value functions (GVFs) and the Horde architecture. GVFs have been shown able to represent a wide variety of facts about the world's dynamics that may be useful to a long-lived agent (Sutton et al. 2011). We have also previously shown scaling - that thousands of on-policy GVFs can be learned accurately in real-time on a mobile robot (Modayil, White & Sutton 2011). That work was limited in that it learned about only one policy at a time, whereas the greatest potential benefits of life-long learning come from learning about many policies in parallel, as we explore in this paper. Many new challenges arise in this off-policy learning setting. To deal with convergence and efficiency challenges, we utilize the recently introduced GTD({\lambda}) algorithm. We show that GTD({\lambda}) with tile coding can simultaneously learn hundreds of predictions for five simple target policies while following a single random behavior policy, assessing accuracy with interspersed on-policy tests. To escape the need for the tests, which preclude further scaling, we introduce and empirically validate two online estimators of the off-policy objective (MSPBE). Finally...
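For reference, the sketch below shows one published form of the GTD(\lambda) update the abstract relies on for off-policy prediction with linear features: the value weights theta, an auxiliary weight vector w, an importance-sampling ratio rho, and an eligibility trace e. The hyperparameters and the tabular feature stream at the end are placeholders, not the robot setup from the paper.

```python
# GTD(lambda) prediction sketch with linear features (one published form of the update).
import numpy as np

def gtd_lambda(transitions, n_features, gamma=0.95, lam=0.7, alpha=0.02, beta=0.005):
    """transitions: iterable of (phi, rho, reward, phi_next)."""
    theta = np.zeros(n_features)   # value weights
    w = np.zeros(n_features)       # auxiliary weights
    e = np.zeros(n_features)       # eligibility trace
    for phi, rho, r, phi_next in transitions:
        delta = r + gamma * theta @ phi_next - theta @ phi
        e = rho * (phi + gamma * lam * e)
        theta += alpha * (delta * e - gamma * (1 - lam) * (e @ w) * phi_next)
        w += beta * (delta * e - (phi @ w) * phi)
    return theta

# Placeholder usage: tabular features on a 3-state loop, rho = 1 throughout.
def one_hot(i, n=3):
    v = np.zeros(n)
    v[i] = 1.0
    return v

stream, s = [], 0
for _ in range(5000):
    s2 = (s + 1) % 3
    stream.append((one_hot(s), 1.0, float(s2 == 0), one_hot(s2)))
    s = s2
print(np.round(gtd_lambda(stream, n_features=3), 2))
```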

Reducing Commitment to Tasks with Off-Policy Hierarchical Reinforcement Learning

Bloch, Mitchell Keith
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 26/04/2011 | Language: Portuguese
Search relevance: 46.19%
In experimenting with off-policy temporal difference (TD) methods in hierarchical reinforcement learning (HRL) systems, we have observed unwanted on-policy learning under reproducible conditions. Here we present modifications to several TD methods that prevent unintentional on-policy learning from occurring. These modifications create a tension between exploration and learning. Traditional TD methods require commitment to finishing subtasks without exploration in order to update Q-values for early actions with high probability. One-step intra-option learning and temporal second difference traces (TSDT) do not suffer from this limitation. We demonstrate that our HRL system is efficient without commitment to completion of subtasks in a cliff-walking domain, contrary to a widespread claim in the literature that it is critical for efficiency of learning. Furthermore, decreasing commitment as exploration progresses is shown to improve both online performance and the resultant policy in the taxicab domain, opening a new avenue for research into when it is more beneficial to continue with the current subtask or to replan.
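As background for the abstract's point, the sketch below shows the one-step intra-option Q-learning update (in the Sutton, Precup and Singh formulation) that allows learning without committing to finishing a subtask: every option whose policy would have taken the executed primitive action is updated. The toy chain domain and option definitions are invented placeholders, not the cliff-walking or taxicab domains used in the paper.

```python
# One-step intra-option Q-learning sketch: update all options consistent with
# the primitive action actually taken, using option termination probabilities.
import numpy as np

n_states, n_options = 6, 2
gamma, alpha = 0.95, 0.1
Q = np.zeros((n_states, n_options))

# Toy options: option 0 always moves right, option 1 always moves left;
# both terminate with probability 0.2 in every state (illustrative only).
def option_action(o, s):
    return 1 if o == 0 else 0          # 1 = right, 0 = left

def option_beta(o, s):
    return 0.2                         # termination probability

def intra_option_update(s, a, r, s2):
    for o in range(n_options):
        if option_action(o, s) != a:   # option inconsistent with the taken action
            continue
        beta = option_beta(o, s2)
        u = (1 - beta) * Q[s2, o] + beta * Q[s2].max()   # value on arrival at s2
        Q[s, o] += alpha * (r + gamma * u - Q[s, o])

rng = np.random.default_rng(0)
s = 0
for _ in range(5000):
    a = int(rng.integers(2))                               # random primitive behavior
    s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
    r = 1.0 if s2 == n_states - 1 else 0.0
    intra_option_update(s, a, r, s2)
    s = 0 if s2 == n_states - 1 else s2
print(np.round(Q, 2))
```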

From Pixels to Torques: Policy Learning with Deep Dynamical Models

Wahlström, Niklas; Schön, Thomas B.; Deisenroth, Marc Peter
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Language: Portuguese
Search relevance: 46.13%
Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art reinforcement learning methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces and is an important step toward fully autonomous learning from pixels to torques.; Comment: 9 pages
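The sketch below, assuming PyTorch is available, illustrates the joint objective described in the abstract: an auto-encoder reconstruction loss combined with a prediction loss on a latent dynamics model, trained together so the embedding captures dynamic as well as static structure. Network sizes, data, and training details are placeholders; this is not the authors' implementation or their model predictive control loop.

```python
# Joint auto-encoder + latent dynamics training sketch (placeholder data).
import torch
import torch.nn as nn

img_dim, latent_dim, action_dim = 64, 4, 1
encoder = nn.Sequential(nn.Linear(img_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, img_dim))
dynamics = nn.Sequential(nn.Linear(latent_dim + action_dim, 32), nn.ReLU(),
                         nn.Linear(32, latent_dim))
params = list(encoder.parameters()) + list(decoder.parameters()) + list(dynamics.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

# Placeholder batch of (image, action, next_image) transitions.
x, u, x_next = torch.rand(128, img_dim), torch.rand(128, action_dim), torch.rand(128, img_dim)

for step in range(200):
    z, z_next = encoder(x), encoder(x_next)
    recon_loss = nn.functional.mse_loss(decoder(z), x)                               # static structure
    pred_loss = nn.functional.mse_loss(dynamics(torch.cat([z, u], dim=1)), z_next)   # latent dynamics
    loss = recon_loss + pred_loss                                                    # joint training
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```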

Rollout Sampling Approximate Policy Iteration

Dimitrakakis, Christos; Lagoudakis, Michail G.
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Language: Portuguese
Search relevance: 46.19%
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem. This paper proposes variants of an improved policy iteration scheme which addresses the core sampling problem in evaluating a policy through simulation as a multi-armed bandit machine. The resulting algorithm offers performance comparable to that of the previous algorithm, achieved, however, with significantly less computational effort. An order of magnitude improvement is demonstrated experimentally in two standard reinforcement learning domains: inverted pendulum and mountain-car.; Comment: 18 pages, 2 figures, to appear in Machine Learning 72(3). Presented at EWRL08, to be presented at ECML 2008
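The sketch below shows the basic classification-based policy iteration loop that such schemes build on: sample states, estimate each action's value by Monte Carlo rollouts under the current policy, label each state with the apparently best action, and fit a classifier as the next policy. The bandit-style allocation of rollouts that is the paper's actual contribution is not implemented; the toy 1-D task and the 1-nearest-neighbor classifier are made up.

```python
# Classification-based approximate policy iteration via rollouts (toy sketch).
import numpy as np

rng = np.random.default_rng(0)
actions = np.array([-1.0, 1.0])
gamma, horizon, n_rollouts, n_states = 0.9, 15, 5, 40

def step(x, a):
    x2 = float(np.clip(x + 0.2 * a + rng.normal(scale=0.05), -1.0, 1.0))
    return x2, -abs(x2)                              # reward: stay near the origin

def rollout(x, a, policy):
    ret, discount = 0.0, 1.0
    for _ in range(horizon):
        x, r = step(x, a)
        ret += discount * r
        discount *= gamma
        a = policy(x)                                # follow the current policy afterwards
    return ret

def make_nn_policy(states, labels):
    """1-nearest-neighbor classifier used as the policy representation."""
    return lambda x: labels[int(np.argmin(np.abs(states - x)))]

policy = lambda x: float(rng.choice(actions))        # initial random policy
for it in range(5):                                  # approximate policy iteration
    states = rng.uniform(-1.0, 1.0, size=n_states)
    labels = np.array([
        actions[int(np.argmax([np.mean([rollout(x, a, policy) for _ in range(n_rollouts)])
                               for a in actions]))]
        for x in states])
    policy = make_nn_policy(states, labels)

test = np.linspace(-1, 1, 9)
print([float(policy(x)) for x in test])              # should mostly point toward 0
```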

Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

Su, Pei-Hao; Vandyke, David; Gasic, Milica; Mrksic, Nikola; Wen, Tsung-Hsien; Young, Steve
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Language: Portuguese
Search relevance: 56.02%
Statistical spoken dialogue systems have the attractive property of being able to be optimised from data via interactions with real users. However in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real users where learning costs are expensive. Reward shaping is one promising technique for addressing these concerns. Here we examine three recurrent neural network (RNN) approaches for providing reward shaping information in addition to the primary (task-orientated) environmental feedback. These RNNs are trained on returns from dialogues generated by a simulated user and attempt to diffuse the overall evaluation of the dialogue back down to the turn level to guide the agent towards good behaviour faster. In both simulated and real user scenarios these RNNs are shown to increase policy learning speed. Importantly, they do not require prior knowledge of the user's goal.; Comment: Accepted for publication in SigDial 2015
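The sketch below shows only the generic additive reward-shaping mechanism such systems build on, in its potential-based form F = gamma * phi(s') - phi(s); the recurrent network that supplies the shaping signal in the paper is replaced here by a hypothetical hand-coded potential over a made-up dialogue state.

```python
# Generic potential-based reward shaping: add F = gamma*phi(s') - phi(s) to the
# sparse task reward at each turn. The potential below is a hypothetical stand-in
# for the RNN-predicted signal used in the paper.
gamma = 0.99

def potential(state):
    # Hypothetical heuristic: reward progress by the number of dialogue slots filled.
    return float(state["slots_filled"])

def shaped_reward(reward, state, next_state):
    shaping = gamma * potential(next_state) - potential(state)
    return reward + shaping

s, s_next = {"slots_filled": 1}, {"slots_filled": 2}
print(shaped_reward(0.0, s, s_next))   # sparse task reward 0.0 plus shaping bonus
```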

Policy Gradient Methods for Off-policy Control

Lehnert, Lucas; Precup, Doina
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 13/12/2015 | Language: Portuguese
Search relevance: 46.21%
Off-policy learning refers to the problem of learning the value function of a way of behaving, or policy, while following a different policy. Gradient-based off-policy learning algorithms, such as GTD and TDC/GQ, converge even when using function approximation and incremental updates. However, they have been developed for the case of a fixed behavior policy. In control problems, one would like to adapt the behavior policy over time to become more greedy with respect to the existing value function. In this paper, we present the first gradient-based learning algorithms for this problem, which rely on the framework of policy gradient in order to modify the behavior policy. We present derivations of the algorithms, a convergence theorem, and empirical evidence showing that they compare favorably to existing approaches.
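As generic background (not the paper's GTD-based algorithms), the sketch below illustrates the importance-sampling correction that lets a policy-gradient update consume actions drawn from a different, fixed behavior policy, shown on a simple multi-armed bandit with a softmax target policy; arm rewards and step sizes are invented.

```python
# Importance-weighted policy-gradient update from behavior-policy data (bandit toy).
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.9])        # made-up arm rewards
theta = np.zeros(3)                           # target-policy preferences
behavior = np.ones(3) / 3                     # fixed uniform behavior policy
alpha = 0.05

def pi(th):
    p = np.exp(th - th.max())
    return p / p.sum()

for t in range(5000):
    a = int(rng.choice(3, p=behavior))        # act with the behavior policy
    r = true_means[a] + rng.normal(scale=0.1)
    probs = pi(theta)
    rho = probs[a] / behavior[a]              # importance-sampling ratio
    grad_log = -probs.copy()
    grad_log[a] += 1.0                        # grad of log pi(a) for a softmax policy
    theta += alpha * rho * r * grad_log       # off-policy REINFORCE-style update
print("target policy:", np.round(pi(theta), 2))
```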

Interactive Policy Learning through Confidence-Based Autonomy

Chernova, Sonia; Veloso, Manuela
Source: Cornell University | Publisher: Cornell University
Type: Journal article
Published: 14/01/2014 | Language: Portuguese
Search relevance: 46.12%
We present Confidence-Based Autonomy (CBA), an interactive algorithm for policy learning from demonstration. The CBA algorithm consists of two components which take advantage of the complementary abilities of humans and computer agents. The first component, Confident Execution, enables the agent to identify states in which demonstration is required, to request a demonstration from the human teacher and to learn a policy based on the acquired data. The algorithm selects demonstrations based on a measure of action selection confidence, and our results show that using Confident Execution the agent requires fewer demonstrations to learn the policy than when demonstrations are selected by a human teacher. The second algorithmic component, Corrective Demonstration, enables the teacher to correct any mistakes made by the agent through additional demonstrations in order to improve the policy and future task performance. CBA and its individual components are compared and evaluated in a complex simulated driving domain. The complete CBA algorithm results in the best overall learning performance, successfully reproducing the behavior of the teacher while balancing the tradeoff between number of demonstrations and number of incorrect actions during learning.
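The sketch below illustrates the Confident Execution idea in miniature: the agent predicts an action with a confidence score and requests a demonstration from a (here simulated) teacher whenever confidence falls below a threshold. The 1-D task, nearest-neighbor classifier, and distance-based confidence measure are placeholders, and the Corrective Demonstration component is not shown.

```python
# Confident Execution loop sketch: ask the teacher when confidence is low,
# otherwise act on the learned prediction.
import numpy as np

rng = np.random.default_rng(0)

def teacher(x):
    """Simulated human teacher (oracle policy) standing in for real demonstrations."""
    return 1.0 if x < 0.5 else -1.0

demo_states, demo_actions = [], []       # demonstration dataset
threshold = 0.7                          # request a demo below this confidence

def predict(x):
    """Nearest-neighbor action with a simple distance-based confidence score."""
    if not demo_states:
        return 0.0, 0.0
    d = np.abs(np.array(demo_states) - x)
    i = int(np.argmin(d))
    confidence = 1.0 / (1.0 + d[i] / 0.05)   # nearby demonstration -> high confidence
    return demo_actions[i], confidence

n_requests = 0
for t in range(200):
    x = rng.uniform(0.0, 1.0)                # states encountered during execution
    action, confidence = predict(x)
    if confidence < threshold:               # Confident Execution: ask rather than guess
        action = teacher(x)
        demo_states.append(x)
        demo_actions.append(action)
        n_requests += 1
    # ...the agent would now execute `action` in its environment...
print(f"demonstrations requested for {n_requests} of 200 states")
```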

Policy Change and Policy Learning in a New Democracy: Response to Extreme Floods in Hungary

Albright, Elizabeth Ann
Source: Duke University | Publisher: Duke University
Type: Dissertation | Format: 2596605 bytes; application/pdf
Published: //2009 | Language: Portuguese
Search relevance: 46.15%

Climate scientists predict increases in frequency and intensity of extreme climatic events over the next century. I used the policy change and policy learning theoretical frameworks--predominantly the advocacy coalition framework (ACF) and the focusing event literature--along with the literature on stakeholder participatory processes, to analyze what policy change occurs and what is learned as a result of experiencing extreme and damaging events. I analyze change in response to catastrophe by examining the response of national and local governments to a series of extreme floods (1998-2002) in a newly democratizing nation, Hungary. I used both qualitative analysis--examination of case studies based on data collected in semi-structured interviews with key informants in the flood and water policy domain--and quantitative analysis--based on a survey of mayors of towns (n=141) in two river basins that had experienced varying degrees of flooding. From these analyses I conclude that the experience of extreme and damaging floods alone was not sufficient to produce policy change and learning. However, a number of factors in concert with the extreme events enabled policy change to occur: (1) The process of democratization allowed alternative voices to be heard in national-level flood policy discussions. (2) A coalition of individuals and organizations espousing an ecological alternative to traditional engineering approaches to flood management coalesced to press for policy change after the floods occurred. (3) Key policy entrepreneurs...