RL • MARL • QRL • IRL
I am Can Savcı, an aspiring scholar and technologist working at the intersection of Reinforcement Learning and Quantum Computing. My academic journey began when I ranked 1417th in the YGS/LYS Exam, which led to my admission to Bilkent University on a Comprehensive Scholarship.
My research focuses on developing intelligent agents that can learn optimal policies through interaction with complex environments. I am particularly fascinated by the Markov Decision Process framework and its applications in real-world scenarios where uncertainty and partial observability present significant challenges.
I am also fascinated by exploration strategies that help agents discover optimal behavior through interaction with complex environments, and by the credit assignment problem, which is a fundamental bottleneck in multi-agent reinforcement learning (MARL).
My current work bridges classical reinforcement learning with quantum-enhanced algorithms, exploring how quantum superposition and entanglement can improve exploration strategies and policy convergence in multi-agent systems. Through rigorous mathematical formulation and experimental validation, I aim to push the boundaries of what's possible in autonomous decision-making systems.
Designing and training individual agents to learn optimal policies in various environments, focusing on value-based, policy-based and actor-critic algorithms.
Developing coordination strategies for multiple agents in shared environments, focusing on Nash equilibria and cooperative policy learning with communication protocols.
Inferring reward functions from expert demonstrations using maximum entropy methods and Bayesian approaches to understand implicit objectives.
Making agent decision-making transparent through attention mechanisms, saliency maps, and interpretable policy representations for human-AI collaboration.
Ensuring reliable performance under distributional shift, adversarial perturbations, and environmental uncertainty through risk-aware optimization.
Leveraging quantum superposition and entanglement to enhance exploration, accelerate convergence, and tackle problems with exponentially large state spaces.
10th International Conference on Computer Science and Engineering (UBMK 2025)
Abstract: This research advances multi-agent systems for marine plastic and solid waste collection through a novel quantum-enhanced reinforcement learning framework. We developed a hybrid architecture featuring GRU-based actors and attention-mechanism critics, systematically addressing the fundamental challenges of partial observability, insufficient exploration, and policy collapse in traditional MARL methods.
Key Innovation: The integration of quantum circuits with MARL enables unprecedented exploration capacity beyond classical RL limitations. Our quantum-enhanced structure achieves more stable policy learning while enabling discovery of strategies unreachable by classical methods, demonstrating measurable improvements in coordination efficiency and environmental impact reduction.
Impact: COMA++ represents a significant contribution to both the academic understanding of quantum-classical hybrid learning systems and practical environmental sustainability applications, providing a scalable framework for autonomous marine cleanup operations.
Delivering comprehensive Reinforcement Learning courses covering fundamentals, value-based methods (Q-Learning, DQN), policy-gradient and actor-critic methods (REINFORCE, Actor-Critic, A3C, TRPO, PPO, DDPG, TD3, SAC), advanced topics (Hierarchical RL, Behavioral Cloning, Reward Design), and practical offline RL applications on real datasets using CQL. Guiding students through both theory and hands-on deployment challenges.
Supported AI courses as an assistant mentor, helping students with assignments, project guidance, and clarifying machine learning concepts remotely.
Built computer vision models based on ResNet, U-Net, YOLO, GAN, and VAE architectures for complex image processing tasks. Applied both traditional and modern CV techniques to localization, cleaning, comparison, colorization, and face swapping problems.
Implemented state-of-the-art Intent Detection and Slot Filling models using BERT and GPT-2. Optimized deployment on Jetson TX2 using ONNX Runtime, PyTorch, and TensorRT. Managed the end-to-end ML pipeline from data creation to model deployment.
Developed mission-critical defense systems using Java 8, OSGi, and Swing frameworks. Mastered advanced software engineering practices including clean code principles, design patterns, and multithreading optimization.
Developed smart contracts in Solidity and implemented Blockchain-as-a-Service APIs. Conducted research on Elliptic Curve Diffie-Hellman protocols and delivered technical presentations on cryptographic implementations.
+905307801024
highcsavci@gmail.com
Connect for collaborations
Explore implementations