In this talk, we discuss classical resource allocation problems viewed through a finite-time learning (regret) lens. First, we look at (queueing + learning) in a multi-armed bandit setting, where queues must be matched to servers (a.k.a. the arms of a bandit, e.g. channels) whose service rates are unknown. We study algorithms that minimize queue-regret: the expected difference between the queue lengths (backlogs) obtained by the algorithm and those obtained by a genie-aided matching algorithm that knows the exact service rates. A naive view of this problem would suggest that queue-regret could grow logarithmically with time, since queue-regret can be no larger than classical regret. Our work shows surprisingly more complex behavior: the optimal queue-regret eventually decreases with time, scaling as O(1/t). Next, we consider a (networks + learning) bandit, where N agents collaborate to identify the best resource (a.k.a. arm) by exchanging only arm recommendations through pairwise gossip. We establish that even with such limited communication, the regret per agent is a factor of order N smaller than without collaboration. Furthermore, we show that the communication constraints have only a second-order effect on regret. In both problems, we discuss the new insights for resource allocation that emerge from this bandit and learning perspective.
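
To make the queue-regret notion above concrete, here is one way to write it in symbols; the notation (Psi, Q, Q*) is ours, not necessarily the talk's:

```latex
% Queue-regret at time t (notation is ours, chosen for illustration):
%   Q(t)   queue length (backlog) under the learning algorithm
%   Q*(t)  queue length under the genie that knows the true service rates
\Psi(t) \;=\; \mathbb{E}\!\left[ Q(t) \right] \;-\; \mathbb{E}\!\left[ Q^{*}(t) \right],
\qquad \text{with the result that the optimal } \Psi(t) = O\!\left( 1/t \right).
```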
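As a rough illustration of the gossip-based collaboration in the second problem, here is a minimal toy sketch in Python: each agent runs UCB1 over a small active set of arms, and every `gossip_every` rounds asks one uniformly random peer for its current best-arm recommendation, inserting only the arm index (no reward statistics are shared, matching the recommendations-only constraint). All names, the schedule, and the insert rule are our assumptions for illustration, not the algorithm from the talk.

```python
import math
import random

class Agent:
    """One agent running UCB1 over a small 'active' subset of arms,
    receiving best-arm recommendations from peers via gossip."""

    def __init__(self, active_arms):
        self.active = set(active_arms)
        self.counts = {a: 0 for a in self.active}
        self.sums = {a: 0.0 for a in self.active}
        self.t = 0

    def add_arm(self, arm):
        # Insert a recommended arm index into the active set.
        # Only the index travels over the network, never reward data.
        if arm not in self.active:
            self.active.add(arm)
            self.counts[arm] = 0
            self.sums[arm] = 0.0

    def recommend(self):
        # Recommend the empirically best arm seen so far.
        return max(self.active,
                   key=lambda a: self.sums[a] / max(1, self.counts[a]))

    def pull(self, means):
        # UCB1 over the active set: try each arm once, then maximize the index.
        self.t += 1
        untried = [a for a in self.active if self.counts[a] == 0]
        if untried:
            arm = untried[0]
        else:
            arm = max(self.active,
                      key=lambda a: self.sums[a] / self.counts[a]
                      + math.sqrt(2 * math.log(self.t) / self.counts[a]))
        reward = 1.0 if random.random() < means[arm] else 0.0  # Bernoulli arm
        self.counts[arm] += 1
        self.sums[arm] += reward

def run(n_agents=10, n_arms=50, horizon=2000, gossip_every=50, seed=0):
    random.seed(seed)
    means = [random.random() for _ in range(n_arms)]
    # Each agent starts with only a small random sliver of the arm set.
    agents = [Agent(random.sample(range(n_arms), 3)) for _ in range(n_agents)]
    for t in range(1, horizon + 1):
        for ag in agents:
            ag.pull(means)
        if t % gossip_every == 0:
            # Pairwise gossip: each agent polls one uniformly random peer
            # (possibly itself, in this toy) for its current recommendation.
            for ag in agents:
                peer = random.choice(agents)
                ag.add_arm(peer.recommend())
    best = max(range(n_arms), key=lambda a: means[a])
    found = sum(best in ag.active for ag in agents)
    print(f"best arm {best} is in the active set of {found}/{n_agents} agents")

run()
```

The point of the sketch is the information pattern, not the exact learner: each agent explores only a handful of arms at a time, yet good arms spread through recommendations, which is why the per-agent regret can shrink by a factor of order N relative to N agents learning in isolation.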